Machine learning KNN algorithm

Python machine learning

3-day quick start python machine learning in 2018 [dark horse programmer]

Machine learning (4) KNN algorithm

KNN algorithm (also called K-nearest neighbor algorithm)

A small example shows what the KNN algorithm does.

Suppose we have a map. We know only which district each of five people is in and how far each of them is from us, and we want to use KNN to infer which district we are in.
We are at the red circle, so we are most likely in the same district as our nearest neighbor, the blue figure.
Core idea: infer a sample's category from the categories of its neighbors.

Definition

If most of the k most similar samples to a given sample (that is, its k nearest neighbors in the feature space) belong to a certain category, then that sample is assigned to the same category.
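The definition can be sketched in a few lines of plain Python. This is only a minimal illustration, not the scikit-learn implementation: the function name `knn_predict`, the Euclidean distance, and the toy data are all assumptions made for the example.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [math.dist(p, query) for p in train_points]
    # Indices of the k training points with the smallest distances
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Majority vote among the labels of those neighbors
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters in a 2-D feature space
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.1, 1.0), k=3))  # A
print(knn_predict(points, labels, (5.1, 5.0), k=3))  # B
```

Note that this sketch breaks ties arbitrarily (whichever label `Counter` saw first wins), which is one reason the choice of k matters.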

Second example: films

The training set consists of a handful of films: 6 samples with 2 features each. The target value is either "romance film" or "action film". Based on its features, we need to decide which kind of film the unknown film "?" is.
Figure 2 shows the distances between these six films and the "?" film. The map example above corresponds to k = 1. For the films:
When k = 1, "?" is a romance film
When k = 2, "?" is a romance film
When k = 6, the category of "?" cannot be determined
The prediction therefore depends heavily on k. If k is too small, the result is easily swayed by outliers; if k is too large, as with k = 6 above, samples far from the query get a vote too.
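The effect of k can be reproduced with a short vote-counting sketch. The distances below are hypothetical and are not read from the original figure; they are only chosen so that three romance films are close to "?" and three action films are far away. The helper `vote` is likewise an illustrative name.

```python
from collections import Counter

# Hypothetical (distance to the unknown film, label) pairs; illustrative numbers only
neighbors = [
    (18.7, "romance"), (19.2, "romance"), (20.5, "romance"),
    (115.3, "action"), (117.4, "action"), (118.9, "action"),
]


def vote(neighbors, k):
    """Return the majority label among the k nearest neighbors, or None on a tie."""
    nearest = sorted(neighbors)[:k]  # tuples sort by distance first
    counts = Counter(label for _, label in nearest).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # the two leading labels have the same number of votes
    return counts[0][0]


for k in (1, 2, 6):
    print(k, vote(neighbors, k))
```

With these numbers the vote is "romance" for k = 1 and k = 2, and a 3-to-3 tie (reported as `None`) for k = 6.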
Because KNN compares samples by distance, features measured on different scales must first be made dimensionless; that is, the data should be standardized.
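A small sketch shows why this matters. It applies the z-score transformation z = (x - mean) / std, which is the same per-feature formula `StandardScaler` uses, to hypothetical data in which one feature is on a scale of units and the other on a scale of thousands. The data and helper names are made up for the illustration.

```python
import math
import statistics

# Hypothetical training samples: feature 1 in units, feature 2 in thousands
train = [(1.0, 3000.0), (2.0, 5000.0), (8.0, 3200.0), (9.0, 5200.0)]
query = (1.5, 3300.0)

# Per-feature mean and population standard deviation, from the training data only
columns = list(zip(*train))
means = [statistics.fmean(col) for col in columns]
stds = [statistics.pstdev(col) for col in columns]


def standardize(row):
    """Apply z = (x - mean) / std to each feature of one sample."""
    return tuple((x - m) / s for x, m, s in zip(row, means, stds))


def nearest_index(points, target):
    """Index of the point with the smallest Euclidean distance to target."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], target))


raw_nearest = nearest_index(train, query)
scaled_nearest = nearest_index([standardize(r) for r in train], standardize(query))
print("nearest neighbor on raw features:", raw_nearest)
print("nearest neighbor on standardized features:", scaled_nearest)
```

On the raw features the large-scale feature dominates the distance, so the nearest neighbor is sample 2; after standardization both features contribute comparably and the nearest neighbor becomes sample 0.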

Case: prediction of Iris species

1. Load the iris data set
from sklearn.datasets import load_iris
iris = load_iris()
2. Split the data set into training and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=6)
3. Feature engineering: standardization
from sklearn.preprocessing import StandardScaler
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
4. Create the KNN estimator and fit it on the training set
from sklearn.neighbors import KNeighborsClassifier
estimator = KNeighborsClassifier(n_neighbors=3) # Take k=3 here.
estimator.fit(x_train, y_train)
5. Model evaluation
5.1 Compare the true values directly with the predicted values
y_predict = estimator.predict(x_test)
print('y_predict:\n', y_predict)
print('Direct comparison of real and predicted values:\n', y_test == y_predict)

y_predict:
 [0 2 0 0 2 1 1 0 2 1 2 1 2 2 1 1 2 1 1 0 0 2 0 0 1 1 1 2 0 1 0 1 0 0 1 2 1
 2]
Direct comparison of real and predicted values:
 [ True  True  True  True  True  True False  True  True  True  True  True
  True  True  True False  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True False  True
  True  True]
5.2 Compute the accuracy
score = estimator.score(x_test, y_test)
print('Accuracy rate is:\n', score)
Accuracy rate is:
 0.9210526315789473
Complete code:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier


def knn_iris():
    """
    Classify iris species with the KNN algorithm.
    :return: None
    """
    # 1) Get data
    iris = load_iris()
    # 2) Partition data set
    x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=6)
    # 3) Feature Engineering: Standardization
    transfer = StandardScaler()
    x_train = transfer.fit_transform(x_train)
    x_test = transfer.transform(x_test)
    # 4) KNN algorithm predictor
    estimator = KNeighborsClassifier(n_neighbors=3)
    estimator.fit(x_train, y_train)

    # 5) Model evaluation
    # Method 1: compare the true values directly with the predicted values
    y_predict = estimator.predict(x_test)
    print('y_predict:\n', y_predict)
    print('Direct comparison of real and predicted values:\n', y_test == y_predict)

    # Method 2: compute the accuracy
    score = estimator.score(x_test, y_test)
    print('Accuracy rate is:\n', score)

    return None


if __name__ == '__main__':
    # Code 1: classification of iris by KNN algorithm
    knn_iris()
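
Since the result depends on k, one simple experiment is to repeat the steps above for several values of k and compare the accuracies. The sketch below reuses the same split (`random_state=6`) and the same standardization; the list of k values is an arbitrary choice for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Same data, split and standardization as in knn_iris()
iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=6)

transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

# Fit one estimator per k and record its accuracy on the test set
scores = {}
for k in (1, 3, 5, 7, 9, 11):
    estimator = KNeighborsClassifier(n_neighbors=k)
    estimator.fit(x_train, y_train)
    scores[k] = estimator.score(x_test, y_test)
    print(f"k={k}: accuracy={scores[k]:.4f}")
```

This is only meant to show how the score moves with k. Picking k by looking at the test-set score makes the final accuracy optimistic, so in practice k is usually chosen on a separate validation set or by cross-validation.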


Tags: Python

Posted on Mon, 03 Feb 2020 09:28:55 -0500 by MisterWebz