Machine Learning Notes: Non-negative Matrix Factorization (NMF)

1 Introduction to NMF

NMF (non-negative matrix factorization) takes a given non-negative matrix V and finds a non-negative matrix W and a non-negative matrix H such that V ≈ W*H, thus decomposing one non-negative matrix into the product of two non-negative matrices, a left factor and a right factor.

Here, each column of V holds the information of one observation, and each row corresponds to one feature. W is called the base matrix, and H is called the coefficient matrix or weight matrix. Each column of V is the weighted sum of the columns of the base matrix W, with the weights taken from the corresponding column of H.
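The weighted-sum reading can be checked on a tiny example. This is a minimal sketch; the values in W and H are made up for illustration and do not come from a real decomposition.

```python
import numpy as np

# Hypothetical base matrix W: 3 features x 2 base vectors
W = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
# Hypothetical coefficient matrix H: 2 base vectors x 2 observations
H = np.array([[2.0, 0.5],
              [1.0, 3.0]])

V = W @ H  # each column of V is one observation

# Column j of V equals the columns of W weighted by column j of H
j = 0
weighted_sum = H[0, j] * W[:, 0] + H[1, j] * W[:, 1]
print(V[:, j], weighted_sum)
```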

By replacing the original matrix with the coefficient matrix H, you reduce the dimension of the data and obtain a lower-dimensional representation of its features, which also reduces storage space. (Each column of H can be viewed as the coordinates of the corresponding column of V, expressed in the coordinate system formed by the columns of the base matrix W.)
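To make the storage argument concrete, the sketch below counts matrix entries before and after the decomposition. The sizes m, n and k are hypothetical and chosen only for illustration.

```python
# Hypothetical sizes: m features, n observations, k base vectors
m, n, k = 1000, 500, 10

entries_V = m * n                # original matrix V (m x n)
entries_factors = m * k + k * n  # W (m x k) plus H (k x n)
entries_H = k * n                # reduced representation H alone

print(entries_V, entries_factors, entries_H)
```

The saving only appears when k is much smaller than both m and n.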

NMF is essentially a matrix decomposition method. Its defining property is that it decomposes a large non-negative matrix into two smaller non-negative matrices, and because the resulting factors are also non-negative, they can be decomposed further.

The key to non-negative matrix factorization is "non-negative": both the original data and the new base must be non-negative, that is, lie in the "first quadrant", so that the coordinates of the original data on the new base are non-negative as well.

2 Defining NMF in Mathematical Language

The matrix decomposition problem is converted into an optimization problem: minimize the error between the two matrices V and W*H.
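One common way to write this down is sketched below. It assumes V is m×n, W is m×k and H is k×n (the dimension names are assumptions for illustration) and measures the error with the squared Frobenius norm, which is only one of several possible choices:

```latex
\min_{W,\,H} \; \lVert V - WH \rVert_F^2
\quad \text{subject to} \quad W \ge 0,\; H \ge 0,
\qquad V \in \mathbb{R}_{\ge 0}^{m \times n},\;
W \in \mathbb{R}_{\ge 0}^{m \times k},\;
H \in \mathbb{R}_{\ge 0}^{k \times n}
```

The inequalities are meant element-wise.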

3 Iterative Formulas for W and H

An iterative method approaches the final result step by step; the decomposition is considered complete when the two matrices W and H converge.

Note that the product of the two factor matrices is not required to equal the original matrix exactly; some error may remain.
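One well-known iteration scheme is the multiplicative update rule of Lee and Seung for the squared Frobenius error. The sketch below illustrates that rule under this assumption; it is not necessarily the formula these notes originally used. The function name nmf_multiplicative and its parameters are hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=500, tol=1e-6, eps=1e-10, seed=0):
    """Factor a non-negative V (m x n) into W (m x k) and H (k x n)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Random non-negative starting point
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    prev_err = np.inf
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H); eps guards against division by zero
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # W <- W * (V H^T) / (W H H^T)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        err = np.linalg.norm(V - W @ H, "fro")
        # Stop once the error no longer changes noticeably
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return W, H, err

V = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]], dtype=float)
W, H, err = nmf_multiplicative(V, k=2)
print(err)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative as long as they start out non-negative.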

4 Loss Functions of NMF

4.0 Naive Form

The naive form sums the squared errors element by element. Expressed in matrix form, this is the squared Frobenius norm of V - W*H.
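The small check below shows that the element-wise sum and the matrix form give the same number. It is a sketch with made-up matrices, and it assumes the error is the plain sum of squared differences with no scaling factor.

```python
import numpy as np

# Hypothetical matrices for illustration
V = np.array([[1.0, 2.0],
              [3.0, 4.0]])
W = np.array([[1.0],
              [2.0]])
H = np.array([[1.0, 1.5]])

R = V - W @ H  # residual matrix

# Element-wise form: sum of squared errors over all entries
elementwise = sum(R[i, j] ** 2
                  for i in range(R.shape[0])
                  for j in range(R.shape[1]))

# Matrix form: squared Frobenius norm of the residual
matrix_form = np.linalg.norm(R, "fro") ** 2

print(elementwise, matrix_form)
```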

4.1 Squared Frobenius Norm
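Writing X for the original matrix and Y for the product WH (the notation of 4.2), the squared Frobenius loss can be written as follows. The factor 1/2 is the convention scikit-learn uses; it does not change the minimizer.

```latex
d_{\mathrm{Fro}}(X, Y) = \frac{1}{2} \lVert X - Y \rVert_{\mathrm{Fro}}^2
= \frac{1}{2} \sum_{i,j} \left( X_{ij} - Y_{ij} \right)^2
```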


4.2 KL Divergence

X and Y denote the original matrix and the product W*H, respectively.
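With this notation, the generalized Kullback-Leibler divergence, as commonly defined for NMF (for example in the scikit-learn documentation), is:

```latex
d_{\mathrm{KL}}(X, Y) = \sum_{i,j} \left( X_{ij} \log \frac{X_{ij}}{Y_{ij}} - X_{ij} + Y_{ij} \right)
```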

4.3 Itakura-Saito (IS)
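With X the original matrix and Y the product WH, the Itakura-Saito divergence is commonly defined as:

```latex
d_{\mathrm{IS}}(X, Y) = \sum_{i,j} \left( \frac{X_{ij}}{Y_{ij}} - \log \frac{X_{ij}}{Y_{ij}} - 1 \right)
```

It is only defined when all entries of X and Y are strictly positive.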

5 Examples of NMF Applications

5.1 Text Topic Model

Suppose the input consists of m words and n texts. Aij is the feature value of the i-th word in the j-th text.

After NMF decomposition, Wik is the probability correlation between the i-th word and the k-th "topic", and Hkj is the probability correlation between the j-th text and the k-th "topic".

5.2 Image Processing

6 Limitations of NMF

As an elegant matrix decomposition method, NMF works well for topic models and gives interpretable results based on probability distributions.

However, NMF can only assign topics to the texts in the training samples; topic assignments for texts that are not in the samples may be inaccurate.

Reference: Non-negative Matrix Factorization (NMF) of Text Theme Model - Pinard Liu - Blog Park

7 NMF Implementation (sklearn)

# Import libraries
from sklearn.decomposition import NMF
import numpy as np

X = np.array([[1, 1],
              [2, 1],
              [3, 1.2],
              [4, 1],
              [5, 0.8],
              [6, 1]])

# Define the NMF model
model = NMF(
    n_components=2,  # k, the number of components (inner dimension of W and H)
    # beta_loss: {'frobenius', 'kullback-leibler', 'itakura-saito'}
    # Corresponds to loss functions 4.1-4.3 above; 'frobenius' is the default
    beta_loss='frobenius',
    tol=1e-4,  # Tolerance of the stopping condition
    init='random',  # Initialization method for W and H
    random_state=0,  # Seed for the random initialization (value taken from the printed parameters below)
    max_iter=200,  # Maximum number of iterations
    l1_ratio=0.,  # Mixing ratio of L1 regularization
    # alpha=0.,  # Regularization strength (default 0); deprecated in scikit-learn 1.0 and removed in 1.2 in favor of alpha_W and alpha_H
)

# Print the model constructor parameters
print(model.get_params())
# Output of the original run (the keys depend on the scikit-learn version):
# {'alpha': 0.0, 'beta_loss': 'frobenius', 'init': 'random', 'l1_ratio': 0.0, 'max_iter': 200, 'n_components': 2, 'random_state': 0, 'shuffle': False, 'solver': 'cd', 'tol': 0.0001, 'verbose': 0}

W = model.fit_transform(X)
# In effect, model.fit(X) followed by W = model.transform(X)
H = model.components_
print(W)
print(H)

# Output (W, then H):
# [[0.         0.46880684]
#  [0.55699523 0.3894146 ]
#  [1.00331638 0.41925352]
#  [1.6733999  0.22926926]
#  [2.34349311 0.03927954]
#  [2.78981512 0.06911798]]
# [[2.09783018 0.30560234]
#  [2.13443044 2.13171694]]
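As a sanity check, the printed factors can be multiplied back together and compared with X. The sketch below copies the W and H values shown above; the tolerance of 1e-2 is an arbitrary choice for illustration.

```python
import numpy as np

X = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])

# W and H copied from the printed output above
W = np.array([[0.0,        0.46880684],
              [0.55699523, 0.3894146],
              [1.00331638, 0.41925352],
              [1.6733999,  0.22926926],
              [2.34349311, 0.03927954],
              [2.78981512, 0.06911798]])
H = np.array([[2.09783018, 0.30560234],
              [2.13443044, 2.13171694]])

# The product W @ H should be close to X, but not exactly equal
max_abs_err = np.abs(W @ H - X).max()
print(max_abs_err)
```

The small remaining difference illustrates the point from section 3: the product of the factors approximates the original matrix rather than reproducing it exactly.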



Posted on Sat, 11 Sep 2021 14:01:27 -0400 by neuroxik