35. Association Rule Analysis of Shopping Data

35.1. Introduction

This challenge applies association rule mining to shopping basket data to find the frequent itemsets and the association rules among them.

35.2. Knowledge Points

Dataset creation
Data preprocessing
Application of the Apriori algorithm
Generation of association rules

35.3. Challenge Introduction

The challenge provides a supermarket shopping dataset of 7,500 records, each of which describes the contents of a single shopping cart. The dataset can be downloaded with:

wget -nc https://cdn.aibydoing.com/aibydoing/files/shopping_data.csv
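
To get a feel for this kind of data, the following minimal sketch reads basket records into a list of transactions using only the standard library. It assumes that the file stores one cart per row with one item per cell and no header row, which is how the solution later in this section reads it. The sample rows are hypothetical and are not taken from shopping_data.csv.

```python
import csv
import io

# Hypothetical sample in the assumed layout: one cart per row, one item per cell.
sample = io.StringIO(
    "milk,bread,eggs\n"
    "bread,butter\n"
    "milk\n"
)

# Each row becomes one transaction; empty cells are dropped.
transactions = [
    [item.strip() for item in row if item.strip()]
    for row in csv.reader(sample)
]

print(len(transactions))  # number of carts
print(transactions[1])    # items in the second cart
```

To try this on the real file, replace the io.StringIO object with open("shopping_data.csv", newline="").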

35.4. Challenge Content

Exercise 35.1

This challenge uses the Apriori algorithm to perform association rule analysis on the dataset. Find the frequent itemsets that meet a minimum support threshold of 0.05, then compute the association rules that meet a minimum confidence threshold of 0.2.
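
As a reminder of what the two thresholds measure, here is a minimal sketch that computes support and confidence by hand. The helper names support and confidence and the toy carts are hypothetical, chosen only for illustration. The support of an itemset is the fraction of carts that contain it, and the confidence of a rule A -> B is the support of A and B together divided by the support of A.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy carts, not taken from shopping_data.csv.
carts = [
    ["milk", "bread"],
    ["milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
]

print(support(carts, ["milk", "bread"]))       # 2 of 4 carts contain both items
print(confidence(carts, ["milk"], ["bread"]))  # 2 of the 3 milk carts also contain bread
```

With the thresholds from the exercise, an itemset is kept only if its support is at least 0.05, and a rule is kept only if its confidence is at least 0.2.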

Before starting the challenge, open a terminal and run the following command to install the mlxtend machine learning library.

pip install mlxtend

35.5. Challenge Requirements

Save the code in the Code folder as association.py.
Write your code inside def rule(), and do not rename the function.
The function must return two DataFrames in this order: the frequent itemsets, then the association rules.
During testing, run association.py with python to avoid missing-module errors.

35.6. Example Code

def rule():

    ### Add your code here ###

    return frequent_itemsets, association_rules  # Return the DataFrames for the frequent itemsets and association rules

Note: Do not rename rule() or add parameters to it. Otherwise, the system will not be able to grade the results correctly.

Solution to Exercise 35.1

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules as rules

def rule():

    # Each row of the CSV is one shopping cart; there is no header row
    df = pd.read_csv("shopping_data.csv", header=None)
    # Collapse each row into a list of the items in that cart
    dataset = df.stack().groupby(level=0).apply(list).tolist()

    te = TransactionEncoder()  # Create the encoder
    te_ary = te.fit_transform(dataset)  # One-hot encode the transactions
    df = pd.DataFrame(te_ary, columns=te.columns_)  # Wrap the boolean array in a DataFrame

    # Frequent itemsets with a support of at least 0.05
    frequent_itemsets = apriori(df, min_support=0.05, use_colnames=True)
    # Association rules with a confidence of at least 0.2
    association_rules = rules(frequent_itemsets, metric="confidence", min_threshold=0.2)
    
    return frequent_itemsets, association_rules
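
To make the frequent itemset step less of a black box, the following is a minimal pure-Python sketch of the level-wise Apriori search. It is not mlxtend's implementation. The function name apriori_sketch and the toy carts are hypothetical, and the code favors readability over speed.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    carts = [frozenset(t) for t in transactions]
    n = len(carts)

    def support(itemset):
        return sum(1 for c in carts if itemset <= c) / n

    # Level 1: frequent single items.
    items = {item for c in carts for item in c}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Keep only the candidates that meet the support threshold.
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent


# Hypothetical toy carts, not taken from shopping_data.csv.
toy_carts = [
    ["milk", "bread"],
    ["milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
]

result = apriori_sketch(toy_carts, min_support=0.5)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step relies on the Apriori property: an itemset can only be frequent if all of its subsets are frequent, so larger candidates are built only from itemsets that already passed the threshold.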

                        
