Linear discriminant analysis (LDA)
LDA is a supervised dimensionality reduction technique: every sample in its dataset carries a class label. This distinguishes it from PCA, which is an unsupervised dimensionality reduction technique that ignores class labels. The idea of LDA can be summarized in one sentence: "after projection, the within-class variance is as small as possible and the between-class variance is as large as possible".
The LDA algorithm can be used for both dimensionality reduction and classification, but at present it is mainly used for dimensionality reduction. LDA is a powerful tool for data analysis tasks related to image recognition.
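The one-sentence idea can be made concrete for two classes. The following is a minimal NumPy sketch, not the scikit-learn implementation: the two Gaussian classes, their covariance matrix, and the helper name fisher_ratio are assumptions made for illustration. It builds the within-class and between-class scatter matrices and compares the closed-form two-class direction, proportional to S_w^{-1}(m1 - m0), with the naive direction that simply joins the class means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical Gaussian classes that share one covariance matrix
cov = np.array([[3.0, 2.0], [2.0, 3.0]])
X0 = rng.multivariate_normal([0.0, 0.0], cov, size=200)
X1 = rng.multivariate_normal([2.0, 0.0], cov, size=200)

m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

# Within-class scatter: scatter of each class around its own mean, summed
S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)

# Between-class scatter for two classes (up to a constant factor)
diff = (m1 - m0).reshape(-1, 1)
S_b = diff @ diff.T


def fisher_ratio(w):
    """Between-class over within-class scatter after projecting onto w."""
    return float(w @ S_b @ w) / float(w @ S_w @ w)


# Closed-form two-class solution: w is proportional to S_w^{-1} (m1 - m0)
w_lda = np.linalg.solve(S_w, m1 - m0)
w_lda /= np.linalg.norm(w_lda)

# Naive alternative: the direction joining the two class means
w_means = (m1 - m0) / np.linalg.norm(m1 - m0)

print("Fisher ratio, LDA direction: ", fisher_ratio(w_lda))
print("Fisher ratio, mean direction:", fisher_ratio(w_means))
```

The ratio printed for the LDA direction should never be smaller than the one for the mean direction, because the LDA direction is the maximizer of this ratio.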
Main advantages of LDA:
1) Dimensionality reduction can make use of prior knowledge about the classes, whereas unsupervised methods such as PCA cannot.
2) LDA performs better than PCA when the class information lies in the means of the samples rather than in their variance.
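The second point can be illustrated with a small experiment. This is a sketch on hypothetical data, and the helper separation is an illustrative name, not a library function: one feature has a large variance but the same mean in both classes, while the other has a small variance and different class means. Each method projects the data onto one dimension, and the gap between the projected class means is measured in units of the pooled standard deviation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n = 200

# Feature 0: large variance, same mean in both classes (no class information)
# Feature 1: small variance, class means at -1 and +1 (all the class information)
X0 = np.column_stack([rng.normal(0, 5, n), rng.normal(-1, 0.5, n)])
X1 = np.column_stack([rng.normal(0, 5, n), rng.normal(+1, 0.5, n)])
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)


def separation(z, labels):
    """Distance between projected class means in pooled standard deviations."""
    z = np.ravel(z)
    a, b = z[labels == 0], z[labels == 1]
    pooled_std = np.sqrt((a.var() + b.var()) / 2)
    return abs(a.mean() - b.mean()) / pooled_std


z_pca = PCA(n_components=1).fit_transform(X)  # ignores the labels
z_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)  # uses the labels

print("Class separation after PCA:", separation(z_pca, y))
print("Class separation after LDA:", separation(z_lda, y))
```

PCA keeps the high-variance direction, which here carries no class information, so the projected classes overlap; LDA keeps the direction along which the class means differ.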
Main disadvantages of LDA:
1) LDA is not well suited to reducing the dimensionality of samples that do not follow a Gaussian distribution; PCA has the same problem.
2) LDA can reduce the data to at most k-1 dimensions, where k is the number of classes. If more than k-1 dimensions are needed, LDA cannot be used. There are, however, some variants of LDA that work around this limitation.
3) When the class information lies in the variance of the samples rather than in their means, LDA reduces dimensionality poorly.
4) LDA may overfit the data.
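The k-1 limit from point 2 is easy to observe. The sketch below uses the Iris dataset bundled with scikit-learn as an assumed example (it does not appear elsewhere in this article): it has 4 features and 3 classes, so LDA left at its default settings keeps at most 3 - 1 = 2 dimensions.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Iris: 150 samples, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# With n_components left at its default, LDA keeps
# min(n_classes - 1, n_features) dimensions
reducer = LinearDiscriminantAnalysis()
Z = reducer.fit_transform(X, y)

print("Original shape:", X.shape)
print("Shape after LDA:", Z.shape)
```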
Linear classification algorithm
The support vector machine (SVM) is a binary classification model. Its goal is to find a hyperplane that separates the samples. The separating principle is to maximize the margin, which ultimately leads to a convex quadratic programming problem. From simple to complex, the models are:
- When the training samples are linearly separable, a linearly separable support vector machine is learned by hard-margin maximization;
- When the training samples are approximately linearly separable, a linear support vector machine is learned by soft-margin maximization;
- When the training samples are not linearly separable, a nonlinear support vector machine is learned with the kernel trick and soft-margin maximization.
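The difference between a hard and a soft margin can be seen through the regularization parameter C of a linear SVM. The following is a minimal sketch on hypothetical blob data; the helper margin_width is an illustrative name. For a linear SVM with weight vector w, the margin width is 2 / ||w||. A large C penalizes margin violations heavily and approximates the hard-margin case, while a small C tolerates violations and yields a wider margin.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two hypothetical, roughly separable clusters
X, y = make_blobs(n_samples=100, centers=[[-2, -2], [2, 2]],
                  cluster_std=1.0, random_state=0)


def margin_width(C):
    """Fit a linear SVM and return its margin width 2 / ||w||."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]
    return 2.0 / np.linalg.norm(w)


margin_hard_like = margin_width(C=100.0)  # violations are expensive
margin_soft = margin_width(C=0.01)        # violations are cheap

print("Margin width with C=100: ", margin_hard_like)
print("Margin width with C=0.01:", margin_soft)
```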
Implementation of linear discriminant analysis with the scikit-learn library
Importing the packages and the dataset generator
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as lda  # Import the LDA algorithm
from sklearn.datasets import make_classification  # Import the classification data generator
import matplotlib.pyplot as plt  # Import the plotting tools
import numpy as np
import pandas as pd
Generate random data
x, y = make_classification(n_samples=200, n_features=2, n_redundant=0, n_classes=2,
                           n_informative=1, n_clusters_per_class=1, class_sep=0.5,
                           random_state=100)
"""
n_features: total number of features = n_informative + n_redundant + n_repeated + remaining random-noise features
n_informative: number of informative features
n_redundant: number of redundant features, generated as random linear combinations of the informative features
n_repeated: number of duplicated features, drawn at random from the informative and redundant features
n_classes: number of classes
n_clusters_per_class: number of clusters that make up each class
"""
plt.scatter(x[:, 0], x[:, 1], marker='o', c=y)
plt.show()
# Preliminary split (redefined in the next section); the test set starts where the training set ends
x_train = x[:60, :]
y_train = y[:60]
x_test = x[60:, :]
y_test = y[60:]
Splitting the dataset
# Split the data into a training set and a test set for model training and evaluation
x_train = x[:150, :]
y_train = y[:150]
x_test = x[150:, :]  # start where the training set ends, so the two sets do not overlap
y_test = y[150:]
lda_test = lda()
lda_test.fit(x_train, y_train)
predict_y = lda_test.predict(x_test)  # Get the predicted labels
count = 0
for i in range(len(predict_y)):
    if predict_y[i] == y_test[i]:
        count += 1
print("The number of correct predictions is " + str(count))
print("The accuracy is " + str(count / len(predict_y)))
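The manual slicing and counting above can also be expressed with scikit-learn's own helpers. This is one possible sketch, not the only way to do it: train_test_split shuffles the data and produces non-overlapping training and test sets, and accuracy_score replaces the counting loop. The test_size, stratify, and random_state values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same synthetic data as in the example above
x, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_classes=2, n_informative=1,
                           n_clusters_per_class=1, class_sep=0.5,
                           random_state=100)

# Shuffled, non-overlapping split: 150 training samples, 50 test samples
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, stratify=y, random_state=0)

model = LinearDiscriminantAnalysis().fit(x_train, y_train)
y_pred = model.predict(x_test)

# Fraction of test samples whose predicted label matches the true label
acc = accuracy_score(y_test, y_pred)
print("Test accuracy:", acc)
```

Because the test samples are never seen during fitting, the reported accuracy is an estimate of performance on new data rather than on data the model was trained on.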
SVM classification of the moons dataset
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_moons

# Generate the dataset
X, y = make_moons(n_samples=200, random_state=0, noise=0.05)
h = .02  # Step size of the mesh

# Create support vector machine instances and fit them to the data
C = 1.0  # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, y)  # Linear kernel
rbf_svc = svm.SVC(kernel='rbf', gamma=0.7, C=C).fit(X, y)  # Radial basis function kernel
poly_svc = svm.SVC(kernel='poly', degree=3, C=C).fit(X, y)  # Polynomial kernel
lin_svc = svm.LinearSVC(C=C).fit(X, y)  # Linear model fitted by LinearSVC

# Create a mesh on which to draw the decision regions
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Titles of the subplots
titles = ['SVC with linear kernel',
          'LinearSVC (linear kernel)',
          'SVC with RBF kernel',
          'SVC with polynomial (degree 3) kernel']

for i, clf in enumerate((svc, lin_svc, rbf_svc, poly_svc)):
    # Draw the decision boundary and give each region its own color
    plt.subplot(2, 2, i + 1)  # 2 x 2 grid of subplots; make subplot i + 1 the current one
    plt.subplots_adjust(wspace=0.4, hspace=0.4)  # Spacing between subplots

    # Each (xx, yy) pair is one mesh point; predict its class
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Reshape the predictions to the mesh and draw the regions as filled contours
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)

    # Draw the training data as scattered points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.xticks(())
    plt.yticks(())
    plt.title(titles[i])

plt.show()
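The plots show the decision regions but give no number for how well each kernel classifies the data. The sketch below is an addition under assumptions: the 70/30 split and its random_state are chosen for illustration, and the kernel settings are copied from the plotting example above. It scores each kernel on held-out samples so that the pictures can be compared with a measured accuracy.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, random_state=0, noise=0.05)

# Hold out 30% of the samples for evaluation (split parameters are illustrative)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Same kernel settings as in the plotting example
models = {
    "linear": SVC(kernel="linear", C=1.0),
    "rbf": SVC(kernel="rbf", gamma=0.7, C=1.0),
    "poly": SVC(kernel="poly", degree=3, C=1.0),
}

scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)  # mean accuracy on the held-out set
    print(name, "kernel test accuracy:", scores[name])
```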