
39. Agricultural Production Index Modeling Analysis

39.1. Introduction

This challenge models the agricultural production index as a time series and uses the historical data to choose a model that can predict future values of the index.

39.2. Key Points

Data preprocessing
Data resampling
Usage of ARIMA

39.3. Challenge Introduction

In this challenge, we will become familiar with the ARIMA modeling process and methods, and obtain reasonable parameters for the model as the challenge requires.

39.4. Challenge Content

The challenge provides data on China’s agricultural production index from 1952 to 1988 in the data file agriculture.csv, which can be downloaded with:

wget -nc https://cdn.aibydoing.com/aibydoing/files/agriculture.csv

The dataset consists of two columns. A preview of the first 5 rows is as follows:

	year	values
0	1952	100.0
1	1953	101.6
2	1954	103.3
3	1955	111.5
4	1956	116.5
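As a quick check of this layout, the five preview rows can be loaded on their own. The sketch below embeds them as an inline string so that it runs without the download; the comma-separated layout with a header row is an assumption about the file. It also takes the first difference of the series, which is the operation that the d parameter of ARIMA counts.

```python
import io

import pandas as pd

# The five preview rows, embedded inline so the sketch runs without agriculture.csv.
# Assumption: comma-separated layout with a header row.
PREVIEW_CSV = """year,values
1952,100.0
1953,101.6
1954,103.3
1955,111.5
1956,116.5
"""

# Use the year column as the index, leaving a single data column named "values".
df = pd.read_csv(io.StringIO(PREVIEW_CSV), index_col="year")

# First-order difference: the year-over-year change in the index.
# The first row has no predecessor, so it becomes NaN and is dropped.
diff = df["values"].diff().dropna().round(1)

print(diff)
```

Only these five rows come from the preview above; the full file covers 1952 to 1988.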

Exercise 39.1

The challenge requires performing time series ARIMA modeling on this data file and returning three reasonable parameters for ARIMA(p, d, q).
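For reference, p is the autoregressive order, d is the number of times the series is differenced, and q is the moving-average order. In standard textbook notation (the symbols below are not from this chapter), with the backshift operator $B$ defined by $B y_t = y_{t-1}$ and white noise $\varepsilon_t$, an ARIMA(p, d, q) model without a constant term can be written as:

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d} \, y_t
  = \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

With d = 1, the factor $(1 - B)$ turns $y_t$ into the first difference $y_t - y_{t-1}$, and the remaining model is an ARMA(p, q) on that differenced series.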

Before starting the challenge, open a terminal and run the following command to install the statsmodels library.

pip install statsmodels

39.5. Challenge Requirements

Save the code in the Code folder as production_index.py.
Complete the arima() function given below. It must return the p, d, q parameters of the ARIMA model.
When modeling, do not hold out a test set: fit on all the data, and select the order using AIC.
When testing, run production_index.py with python to avoid missing-module errors.

39.6. Example Code

def arima():

    ### Complete the code ###

    return p, d, q

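{solution-start} chapter04_06_1
:class: dropdown

import pandas as pd
from statsmodels.tsa.stattools import arma_order_select_ic


def arima():
    # Load the series, using the year column as the index
    df = pd.read_csv("agriculture.csv", index_col=0)
    # Difference once, which corresponds to d = 1
    diff = df.diff().dropna()
    # Pick the (p, q) pair with the lowest AIC on the differenced series
    p, q = arma_order_select_ic(diff, ic="aic")["aic_min_order"]
    d = 1
    return p, d, q

{solution-end}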

                    
