
39. Agricultural Production Index Modeling Analysis

39.1. Introduction

This challenge applies a machine learning model to the agricultural production index, using historical data to predict future values of the index.

39.2. Key Points

  • Data preprocessing

  • Data resampling

  • Usage of Prophet

39.3. Challenge Introduction

In this challenge, we will become familiar with the ARIMA modeling process and methods, and obtain reasonable model parameters according to the requirements of the experiment.

39.4. Challenge Content

The challenge provides data on China’s agricultural production index from 1952 to 1988, stored in the data file agriculture.csv. It can be downloaded with:

wget -nc https://cdn.aibydoing.com/aibydoing/files/agriculture.csv

The dataset consists of two columns. A preview of the first 5 rows is as follows:

   year  values
0  1952   100.0
1  1953   101.6
2  1954   103.3
3  1955   111.5
4  1956   116.5
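
To get a feel for what the d in ARIMA(p, d, q) does, the minimal sketch below applies first-order differencing to the five preview rows above. It uses plain Python lists rather than the full file, and the helper name `difference` is a hypothetical one introduced only for this illustration.

```python
# Values copied from the five preview rows of agriculture.csv shown above.
values = [100.0, 101.6, 103.3, 111.5, 116.5]


def difference(series, d=1):
    """Apply d rounds of first-order differencing: y[t] - y[t-1]."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


# Round to one decimal to hide floating-point noise in the subtraction.
first_diff = [round(x, 1) for x in difference(values, d=1)]
print(first_diff)  # [1.6, 1.7, 8.2, 5.0]
```

Each round of differencing shortens the series by one observation, so a series differenced with d = 1 has one value fewer than the original.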

Exercise 39.1

The challenge requires performing time series ARIMA modeling on this data file and returning three reasonable parameters for ARIMA(p, d, q).

Before starting the challenge, you need to open the terminal and execute the following steps to install the statsmodels library.

pip install statsmodels

39.5. Challenge Requirements

  1. Save the code in the Code folder and name it production_index.py.

  2. Complete the body of def arima() shown below. The function must return the p, d, q parameters of the ARIMA model.

  3. When modeling, do not split off a test set. Use all the data, and select the model order by AIC.

  4. When testing, run production_index.py with python to avoid missing-module errors.
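
As a reminder of what AIC-based selection means: the Akaike information criterion is AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood, and the candidate with the lowest AIC is preferred. The sketch below illustrates only the selection logic. The candidate orders, log-likelihoods, and parameter counts are made-up numbers for illustration, not results from agriculture.csv.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (p, q) candidates with made-up log-likelihoods and
# parameter counts. These are NOT fitted values from the real data.
candidates = {
    (0, 0): {"loglik": -120.0, "k": 2},
    (1, 0): {"loglik": -115.0, "k": 3},
    (1, 1): {"loglik": -114.8, "k": 4},
}

scores = {order: aic(c["loglik"], c["k"]) for order, c in candidates.items()}
best_order = min(scores, key=scores.get)  # lowest AIC wins
print(best_order)
```

In this made-up example the (1, 1) candidate has the highest likelihood, but its extra parameter costs more than the fit improves, so the simpler (1, 0) candidate has the lower AIC.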

39.6. Example Code

def arima():
    
    ### Complete the code ###
    
    return p, d, q
{solution-start} chapter04_06_1
:class: dropdown
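import pandas as pd
from statsmodels.tsa.stattools import arma_order_select_ic


def arima():
    # Use the year column as the index so only the index values remain as data
    df = pd.read_csv("agriculture.csv", index_col=0)
    # First-order differencing, which corresponds to d = 1
    diff = df.diff().dropna()
    # Pick the (p, q) pair with the lowest AIC on the differenced series
    p, q = arma_order_select_ic(diff, ic="aic")["aic_min_order"]
    d = 1
    return p, d, q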
{solution-end}
