Linear discriminant criterion and linear classification: programming practice

Linear discriminant analysis (LDA)

LDA is a supervised dimensionality-reduction technique: every sample in its dataset carries a class label. This distinguishes it from PCA, an unsupervised dimensionality-reduction technique that ignores class labels entirely. The idea of LDA can be summarized in one sentence: after projection, the within-class variance should be as small as possible and the between-class variance as large as possible.

The LDA algorithm can be used both for dimensionality reduction and for classification, although nowadays it is mainly used for dimensionality reduction. LDA is a powerful tool for data analysis tasks related to image recognition.
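
To make the "smallest within-class variance, largest between-class variance" idea concrete, the two-class Fisher discriminant direction can be computed directly with NumPy. This is a minimal illustrative sketch added here; the use of make_blobs and all variable names are assumptions, not part of the original text.

import numpy as np
from sklearn.datasets import make_blobs

# Two roughly Gaussian classes in 2-D
X, y = make_blobs(n_samples=200, centers=2, n_features=2, random_state=0)

# Class means
mu0 = X[y == 0].mean(axis=0)
mu1 = X[y == 1].mean(axis=0)

# Within-class scatter matrix: the scatter of each class around its own mean
S_w = np.cov(X[y == 0].T) * (np.sum(y == 0) - 1) + np.cov(X[y == 1].T) * (np.sum(y == 1) - 1)

# Fisher discriminant direction: w = S_w^{-1} (mu1 - mu0)
w = np.linalg.solve(S_w, mu1 - mu0)

# Projecting onto w maximizes the between-class separation relative to the within-class variance
z = X @ w
print("between-class separation:", (z[y == 1].mean() - z[y == 0].mean()) ** 2)
print("within-class variance   :", z[y == 0].var() + z[y == 1].var())
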
Advantages

  • 1) LDA can exploit prior class-label information during dimensionality reduction, whereas unsupervised methods such as PCA cannot use any class information.

  • 2) LDA performs better than PCA when the class-discriminative information lies in the class means rather than in the variance.

Disadvantages

  • 1) LDA is not well suited to reducing the dimensionality of samples that are not Gaussian distributed; PCA has the same problem.

  • 2) LDA can reduce the dimension to at most k-1, where k is the number of classes. If the target dimension is greater than k-1, LDA cannot be used directly (see the sketch after this list), although some extensions of LDA work around this limitation.

  • 3) When the class-discriminative information lies in the variance rather than in the means, LDA's dimensionality reduction works poorly.

  • 4) LDA may overfit the data.
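
As an illustration of the k-1 limit (a small sketch added here, using the iris dataset purely as an example), sklearn's LinearDiscriminantAnalysis caps n_components at min(n_features, n_classes - 1):

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 3 classes, 4 features
Z = LinearDiscriminantAnalysis(n_components=2).fit(X, y).transform(X)
print(Z.shape)  # (150, 2): at most k - 1 = 2 components
# LinearDiscriminantAnalysis(n_components=3).fit(X, y)  # would raise ValueError: n_components too large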

Linear classification algorithm

A support vector machine (SVM) is a binary classification model. Its goal is to find a hyperplane that separates the samples, and the separating hyperplane is chosen to maximize the margin, which ultimately leads to a convex quadratic programming problem (see the formulation after this list). From simple to complex, the models include:

  • When the training samples are linearly separable, a linearly separable support vector machine is learned by maximizing the hard margin;
  • When the training samples are approximately linearly separable, a linear support vector machine is learned by maximizing the soft margin;
  • When the training samples are linearly inseparable, a nonlinear support vector machine is learned via the kernel trick together with soft-margin maximization.
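
For reference, the hard-margin case can be written as the following convex quadratic program (a standard textbook formulation, not taken from the original article). The soft-margin case adds slack variables $\xi_i \ge 0$ and a penalty term $C \sum_i \xi_i$; this is the C parameter that appears in the SVC calls later in this article.

$$\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^2 \quad \text{subject to} \quad y_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,n$$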

Implementing linear discriminant analysis with the sklearn library

Dataset and package imports

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as lda  # Import the LDA algorithm
from sklearn.datasets import make_classification  # Import the classification data generator
import matplotlib.pyplot as plt  # Import plotting tools
import numpy as np
import pandas as pd

Generate random data

x, y = make_classification(n_samples=200, n_features=2, n_redundant=0, n_classes=2,
                           n_informative=1, n_clusters_per_class=1, class_sep=0.5, random_state=100)
"""
n_features: total number of features (informative + redundant + repeated + random noise features)
n_informative: number of informative features
n_redundant: number of redundant features, random linear combinations of the informative features
n_repeated: number of repeated features, drawn randomly from the informative and redundant features
n_classes: number of classes
n_clusters_per_class: number of clusters per class
"""
plt.scatter(x[:,0],x[:,1], marker='o', c=y)
plt.show()

Splitting the dataset

# Split into a training set and a test set for model training and evaluation
x_train = x[:150, :]
y_train = y[:150]
x_test = x[150:, :]  # hold out the remaining 50 samples as a test set
y_test = y[150:]
lda_test = lda()
lda_test.fit(x_train, y_train)
predict_y = lda_test.predict(x_test)  # Get the predicted labels
count = 0
for i in range(len(predict_y)):
    if predict_y[i] == y_test[i]:
        count += 1
print("The number of correct predictions is " + str(count))
print("The accuracy is " + str(count / len(predict_y)))

SVM classification of the moons dataset

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_moons

# Import dataset
X, y = make_moons(n_samples=200, random_state=0, noise=0.05)

h = .02  # Step size in mesh

# Create an instance of support vector machine and fit the data
C = 1.0  # SVM regularization parameters
svc = svm.SVC(kernel='linear', C=C).fit(X, y) # Linear kernel
rbf_svc = svm.SVC(kernel='rbf', gamma=0.7, C=C).fit(X, y) # Radial basis kernel
poly_svc = svm.SVC(kernel='poly', degree=3, C=C).fit(X, y) # Polynomial kernel
lin_svc = svm.LinearSVC(C=C).fit(X, y) # Linear SVM (a separate liblinear-based implementation)
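
# Note (added for clarification): C controls the soft-margin penalty. A larger C
# punishes margin violations more heavily (approaching a hard margin), while a
# smaller C tolerates more violations. gamma controls the width of the RBF
# kernel: a larger gamma means a narrower, more flexible kernel.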

# Create a mesh to draw an image
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Title of the diagram
titles = ['SVC with linear kernel',
          'LinearSVC (linear kernel)',
          'SVC with RBF kernel',
          'SVC with polynomial (degree 3) kernel']


for i, clf in enumerate((svc, lin_svc, rbf_svc, poly_svc)):
    # Draw the decision boundary and assign different colors to different areas
    plt.subplot(2, 2, i + 1) # Create a graph with 2 rows and 2 columns, and take the ith graph as the current graph
    plt.subplots_adjust(wspace=0.4, hspace=0.4) # Set subgraph interval

    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])  # Classify every grid point; np.c_ pairs up the xx and yy values as coordinates

    # Draw the classification results
    Z = Z.reshape(xx.shape)  # Reshape the flat predictions back to the grid shape
    plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)  # Color each predicted region with contourf

    # The training data are drawn in the form of discrete points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.xticks(())
    plt.yticks(())
    plt.title(titles[i])

plt.show()
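
To put numbers on the visual differences between the four decision boundaries, the data can also be split and each classifier scored on held-out samples. This is a sketch added here; train_test_split and the 70/30 split are my own choices, not from the original post.

from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for name, clf in [('SVC (linear kernel)', svm.SVC(kernel='linear', C=C)),
                  ('LinearSVC', svm.LinearSVC(C=C, max_iter=10000)),
                  ('SVC (RBF kernel)', svm.SVC(kernel='rbf', gamma=0.7, C=C)),
                  ('SVC (poly, degree 3)', svm.SVC(kernel='poly', degree=3, C=C))]:
    clf.fit(X_tr, y_tr)
    print(name, 'test accuracy:', clf.score(X_te, y_te))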

