
14. Implementation and Application of K-Nearest Neighbor Regression Algorithm#

14.1. Introduction#

In the K-Nearest Neighbor experiment, we learned how to use the algorithm to solve classification problems. In fact, K-Nearest Neighbor can also be used for regression analysis and prediction. In this challenge, you will adapt the K-Nearest Neighbor algorithm and apply it to regression analysis.

14.2. Key Points#

  • Introduction to K-Nearest Neighbor Regression

  • Implementation of K-Nearest Neighbor Regression

14.3. Content Review#

Recall what we learned in the K-Nearest Neighbor experiment. When using the K-Nearest Neighbor algorithm to complete a classification task, the steps required are:

  • Data Preparation: Through data cleaning and data processing, organize each piece of data into a vector.

  • Calculate Distance: Calculate the distance between the test data and the training data.

  • Find Neighbors: Find the K training data samples that are closest to the test data.

  • Decision Classification: According to the decision rule, obtain the class of the test data from the K neighbors.
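The "Calculate Distance" step above typically uses the Euclidean distance. As a minimal sketch (the function name `euclidean_distances` is my own, not from the experiment), the distance from one query point to every training sample can be computed with NumPy broadcasting:

```python
import numpy as np

def euclidean_distances(x, train_data):
    """Euclidean distance from a single query point x (1D array)
    to every row of train_data (2D array)."""
    return np.sqrt(np.sum((train_data - x) ** 2, axis=1))

train = np.array([[1, 1], [2, 2], [3, 3]])
d = euclidean_distances(np.array([1.2, 1.3]), train)
# d is approximately [0.36, 1.06, 2.48]
```

Sorting these distances and taking the first K indices gives the "Find Neighbors" step.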

https://cdn.aibydoing.com/aibydoing/images/document-uid214893labid7506timestamp1546417333161.gif

Among these steps, “Decision Classification” is the key step that determines the category of an unknown sample. When we apply the K-Nearest Neighbor algorithm to regression prediction, we only need to modify this step into a process suited to regression problems.

  • Classification Problem: Based on the classes of the K neighbors, obtain the class of the unknown sample through majority voting.

  • Regression Problem: Based on the target values of the K neighbors, calculate the average value to obtain the predicted value of the unknown sample.
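The two decision rules above can be contrasted in a few lines of Python (the neighbor values here are hypothetical, chosen only to illustrate the difference):

```python
import numpy as np
from collections import Counter

# Hypothetical labels/targets of the k = 3 nearest neighbors
neighbor_values = np.array([3, 3, 5])

# Classification: majority vote over the neighbors' classes
predicted_class = Counter(neighbor_values).most_common(1)[0][0]  # 3

# Regression: mean of the neighbors' target values
predicted_value = neighbor_values.mean()  # (3 + 3 + 5) / 3 ≈ 3.67
```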

The illustration of the K-Nearest Neighbor regression algorithm is as follows:

https://cdn.aibydoing.com/aibydoing/images/document-uid214893labid7506timestamp1546420986145.jpg

Next, you need to implement the K-Nearest Neighbor regression algorithm based on the above illustration and description, and verify it with sample data.

Exercise 14.1

Challenge: Implement the K-Nearest Neighbor regression algorithm based on the above illustration and description.

Specification: Use the Euclidean distance formula to calculate distances. You may refer to code from the experiment.

def knn_regression(train_data, train_labels, test_data, k):
    """
    Parameters:
    train_data -- training data features, numpy.ndarray (2D)
    train_labels -- training data targets, numpy.ndarray (1D)
    test_data -- test data features, numpy.ndarray (2D)
    k -- value of k

    Returns:
    test_labels -- test data targets, numpy.ndarray (1D)
    """

    ### Start code here ### (≈ 10 lines of code)
    test_labels = None
    ### End code here ###

    return test_labels

Next, we provide a set of test data.

import numpy as np

# Training sample features
train_data = np.array(
    [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], [7, 7], [8, 8], [9, 9], [10, 10]]
)
# Training sample target values
train_labels = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

Run the test

# Test sample features
test_data = np.array([[1.2, 1.3], [3.7, 3.5], [5.5, 6.2], [7.1, 7.9]])
# Predict the test sample target values
knn_regression(train_data, train_labels, test_data, k=3)

Expected output

array([2., 4., 6., 7.])
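For reference, one possible way to fill in the template (a sketch following the steps above, not the only valid solution) computes the Euclidean distances, sorts them, and averages the targets of the K closest training samples:

```python
import numpy as np

def knn_regression(train_data, train_labels, test_data, k):
    """Predict each test sample's target as the mean of the
    target values of its k nearest training samples."""
    test_labels = []
    for x in test_data:
        # Euclidean distance from x to every training sample
        distances = np.sqrt(np.sum((train_data - x) ** 2, axis=1))
        # Indices of the k closest training samples
        nearest = np.argsort(distances)[:k]
        # Regression decision: average the neighbors' target values
        test_labels.append(train_labels[nearest].mean())
    return np.array(test_labels)
```

On the test data above, this sketch produces the expected output `array([2., 4., 6., 7.])`.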
