Collaborative filtering algorithms based on matrix factorization

Implementation of the CF algorithm based on matrix factorization (I): LFM

LFM is the Funk SVD matrix factorization mentioned earlier.

Analysis of LFM principle

The core idea of LFM (the latent factor model, also known as the latent semantic model) is to connect users and items through latent features:

  • The P matrix is the user–LF matrix, i.e. the matrix of users against latent features (LF); in this example there are three latent features.
  • The Q matrix is the LF–item matrix, i.e. the matrix of latent features against items.
  • The R matrix is the user–item rating matrix, obtained from $P \times Q$.
  • The decomposition can handle sparse rating matrices.

Using matrix factorization, the original user–item rating matrix (dense or sparse) is decomposed into the matrices P and Q, and the product $P \times Q$ then restores the user–item rating matrix $R$. The whole process amounts to a dimensionality reduction, in which:

  • The entry $P_{11}$ represents the weight of user 1 on latent feature 1.

  • The entry $Q_{11}$ represents the weight of latent feature 1 on item 1.

  • The entry $R_{11}$ represents the predicted rating of user 1 on item 1, where $R_{11} = \vec{P_{1,k}} \cdot \vec{Q_{k,1}}$.
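
To make the decomposition concrete, here is a minimal NumPy sketch; the 4 users, 5 items, and the random values are made-up illustrations, with $K = 3$ latent features as in the example above:

import numpy as np

np.random.seed(0)
k = 3                      # number of latent features
P = np.random.rand(4, k)   # user -> latent-feature weights (4 users)
Q = np.random.rand(k, 5)   # latent-feature -> item weights (5 items)

R = P @ Q                  # restored user-item rating matrix, shape (4, 5)

# R[0, 0] is the predicted rating of user 1 on item 1: the dot product of
# user 1's latent vector and item 1's latent vector.
assert np.isclose(R[0, 0], np.dot(P[0, :], Q[:, 0]))
print(R)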

Using LFM to predict a user's rating of an item, where $K$ is the number of latent features:

$$\hat{r}_{ui} = \vec{p_u} \cdot \vec{q_i} = \sum_{k=1}^{K} p_{uk} q_{ik}$$

Our goal, therefore, is to learn the P and Q matrices, entry by entry, and then use them to predict user–item ratings.

Loss function

For rating prediction, we use the squared error to construct the loss function:

$$Cost = \sum_{u,i\in R} \left(r_{ui} - \hat{r}_{ui}\right)^2 = \sum_{u,i\in R} \left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)^2$$

Adding L2 regularization:

$$Cost = \sum_{u,i\in R} \left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)^2 + \lambda\left(\sum_U p_{uk}^2 + \sum_I q_{ik}^2\right)$$
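
As a sanity check, the regularized cost can be computed directly from the observed ratings. A minimal sketch, assuming P and Q are stored as dicts of latent vectors (as in the implementation below):

import numpy as np

def lfm_cost(ratings, P, Q, lam):
    '''Regularized squared error over observed (uid, iid, r_ui) triples.
    ratings: iterable of (uid, iid, r_ui)
    P, Q:    dicts mapping uid / iid to latent vectors of length k
    lam:     the L2 weight lambda
    '''
    cost = 0.0
    for uid, iid, r_ui in ratings:
        err = r_ui - np.dot(P[uid], Q[iid])  # prediction error for one rating
        cost += err ** 2
    cost += lam * (sum(np.sum(p ** 2) for p in P.values())
                   + sum(np.sum(q ** 2) for q in Q.values()))
    return cost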
Taking partial derivatives of the loss function:

$$\frac{\partial Cost}{\partial p_{uk}} = -2\sum_{u,i\in R}\left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)q_{ik} + 2\lambda p_{uk}$$

$$\frac{\partial Cost}{\partial q_{ik}} = -2\sum_{u,i\in R}\left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)p_{uk} + 2\lambda q_{ik}$$

Stochastic gradient descent optimization

Gradient descent update for the parameter $p_{uk}$:

$$p_{uk} := p_{uk} + \alpha\left[\sum_{u,i\in R}\left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)q_{ik} - \lambda p_{uk}\right]$$

Similarly:

$$q_{ik} := q_{ik} + \alpha\left[\sum_{u,i\in R}\left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)p_{uk} - \lambda q_{ik}\right]$$

Stochastic gradient descent updates the parameters one sample at a time, so the sum over all ratings drops out; what remains is the dot product $\sum_k p_{uk} q_{ik}$, a vector multiplication in which each pair of components is multiplied and the results are summed:

$$p_{uk} := p_{uk} + \alpha\left[\left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)q_{ik} - \lambda_1 p_{uk}\right]$$

$$q_{ik} := q_{ik} + \alpha\left[\left(r_{ui} - \sum_{k=1}^{K} p_{uk} q_{ik}\right)p_{uk} - \lambda_2 q_{ik}\right]$$

Because P and Q are two different matrices, they usually use different regularization parameters, such as $\lambda_1$ and $\lambda_2$.
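
These two update rules translate directly into one SGD step per observed rating. A sketch (the function name and arguments are illustrative; note that both updates use the old vectors):

import numpy as np

def sgd_step(v_pu, v_qi, r_ui, alpha, lambda_1, lambda_2):
    '''One stochastic gradient descent step on a single rating r_ui.
    v_pu, v_qi: latent vectors of user u and item i (length k)
    '''
    err = r_ui - np.dot(v_pu, v_qi)  # components multiplied pairwise and summed
    new_pu = v_pu + alpha * (err * v_qi - lambda_1 * v_pu)
    new_qi = v_qi + alpha * (err * v_pu - lambda_2 * v_qi)
    return new_pu, new_qi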

Algorithm implementation

'''
LFM Model
'''
import pandas as pd
import numpy as np

# Rating prediction on a 1-5 scale
class LFM(object):

    def __init__(self, alpha, reg_p, reg_q, number_LatentFactors=10, number_epochs=10, columns=["uid", "iid", "rating"]):
        self.alpha = alpha # Learning rate
        self.reg_p = reg_p    # P-matrix regularity
        self.reg_q = reg_q    # Q-matrix regularity
        self.number_LatentFactors = number_LatentFactors  # Number of implicit categories
        self.number_epochs = number_epochs    # Maximum number of iterations
        self.columns = columns

    def fit(self, dataset):
        '''
        fit dataset
        :param dataset: uid, iid, rating
        :return:
        '''

        self.dataset = pd.DataFrame(dataset)

        self.users_ratings = self.dataset.groupby(self.columns[0]).agg([list])[[self.columns[1], self.columns[2]]]
        self.items_ratings = self.dataset.groupby(self.columns[1]).agg([list])[[self.columns[0], self.columns[2]]]

        self.globalMean = self.dataset[self.columns[2]].mean()

        self.P, self.Q = self.sgd()

    def _init_matrix(self):
        '''
        Initialize the P and Q matrices with random values in [0, 1) as initial values
        :return:
        '''
        # User-LF
        P = dict(zip(
            self.users_ratings.index,
            np.random.rand(len(self.users_ratings), self.number_LatentFactors).astype(np.float32)
        ))
        # Item-LF
        Q = dict(zip(
            self.items_ratings.index,
            np.random.rand(len(self.items_ratings), self.number_LatentFactors).astype(np.float32)
        ))
        return P, Q

    def sgd(self):
        '''
        Use stochastic gradient descent to optimize the model parameters
        :return:
        '''
        P, Q = self._init_matrix()

        for i in range(self.number_epochs):
            print("iter%d"%i)
            error_list = []
            for uid, iid, r_ui in self.dataset.itertuples(index=False):
                v_pu = P[uid]  # user latent vector (User-LF)
                v_qi = Q[iid]  # item latent vector (Item-LF)
                err = np.float32(r_ui - np.dot(v_pu, v_qi))

                # Keep a copy of the old user vector so the item update does
                # not see the already-updated v_pu.
                v_pu_old = v_pu.copy()
                v_pu += self.alpha * (err * v_qi - self.reg_p * v_pu)
                v_qi += self.alpha * (err * v_pu_old - self.reg_q * v_qi)

                P[uid] = v_pu
                Q[iid] = v_qi

                # Equivalent component-wise update:
                # for k in range(self.number_LatentFactors):
                #     v_pu[k] += self.alpha * (err * v_qi[k] - self.reg_p * v_pu[k])
                #     v_qi[k] += self.alpha * (err * v_pu_old[k] - self.reg_q * v_qi[k])

                error_list.append(err ** 2)
            print("training RMSE: %.4f" % np.sqrt(np.mean(error_list)))
        return P, Q

    def predict(self, uid, iid):
        # If uid or iid is unknown, fall back to the global mean rating as the prediction
        if uid not in self.users_ratings.index or iid not in self.items_ratings.index:
            return self.globalMean

        p_u = self.P[uid]
        q_i = self.Q[iid]

        return np.dot(p_u, q_i)

    def test(self,testset):
        '''Predict ratings for the test set'''
        for uid, iid, real_rating in testset.itertuples(index=False):
            try:
                pred_rating = self.predict(uid, iid)
            except Exception as e:
                print(e)
            else:
                yield uid, iid, real_rating, pred_rating

if __name__ == '__main__':
    dtype = [("userId", np.int32), ("movieId", np.int32), ("rating", np.float32)]
    dataset = pd.read_csv("datasets/ml-latest-small/ratings.csv", usecols=range(3), dtype=dict(dtype))

    lfm = LFM(0.02, 0.01, 0.01, 10, 100, ["userId", "movieId", "rating"])
    lfm.fit(dataset)

    while True:
        uid = input("uid: ")
        iid = input("iid: ")
        print(lfm.predict(int(uid), int(iid)))
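
The test method above is never exercised by the interactive loop. A usage sketch for estimating RMSE on held-out data (the random 80/20 split and the 20 epochs are illustrative choices, not from the original script):

# Illustrative evaluation: hold out 20% of the ratings and measure RMSE
# with the test() method defined above.
import numpy as np
import pandas as pd

dtype = [("userId", np.int32), ("movieId", np.int32), ("rating", np.float32)]
dataset = pd.read_csv("datasets/ml-latest-small/ratings.csv", usecols=range(3), dtype=dict(dtype))

mask = np.random.rand(len(dataset)) < 0.8   # random 80/20 split
trainset, testset = dataset[mask], dataset[~mask]

lfm = LFM(0.02, 0.01, 0.01, 10, 20, ["userId", "movieId", "rating"])
lfm.fit(trainset)

results = list(lfm.test(testset))
rmse = np.sqrt(np.mean([(real - pred) ** 2 for _, _, real, pred in results]))
print("test RMSE: %.4f" % rmse)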

Implementation of the CF algorithm based on matrix factorization (II): BiasSvd

BiasSvd is simply the Funk SVD matrix factorization mentioned earlier with bias terms added.

BiasSvd

Using BiasSvd to predict a user's rating of an item, where $K$ is the number of latent features:

$$\hat{r}_{ui} = \mu + b_u + b_i + \sum_{k=1}^{K} p_{uk} q_{ik}$$

Here $\mu$ is the global mean rating, $b_u$ is the user bias, and $b_i$ is the item bias.

Loss function

As before, for rating prediction we use the squared error to construct the loss function:

$$Cost = \sum_{u,i\in R}\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right)^2$$

Adding L2 regularization:

$$Cost = \sum_{u,i\in R}\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right)^2 + \lambda\left(\sum_U b_u^2 + \sum_I b_i^2 + \sum_U p_{uk}^2 + \sum_I q_{ik}^2\right)$$
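
The bias-aware cost differs from the LFM one only in the error term and the extra bias penalties; a sketch under the same dict-based storage assumption as before:

import numpy as np

def biassvd_cost(ratings, P, Q, bu, bi, mu, lam):
    '''Regularized squared error for BiasSvd.
    bu, bi: dicts mapping uid / iid to scalar biases; mu: global mean rating
    '''
    cost = 0.0
    for uid, iid, r_ui in ratings:
        err = r_ui - mu - bu[uid] - bi[iid] - np.dot(P[uid], Q[iid])
        cost += err ** 2
    cost += lam * (sum(b ** 2 for b in bu.values())
                   + sum(b ** 2 for b in bi.values())
                   + sum(np.sum(p ** 2) for p in P.values())
                   + sum(np.sum(q ** 2) for q in Q.values()))
    return cost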
Taking partial derivatives of the loss function (the derivatives with respect to $p_{uk}$ and $q_{ik}$ are the same as before; the new bias terms give):

$$\frac{\partial Cost}{\partial b_u} = -2\sum_{u,i\in R}\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right) + 2\lambda b_u$$

$$\frac{\partial Cost}{\partial b_i} = -2\sum_{u,i\in R}\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right) + 2\lambda b_i$$

Stochastic gradient descent optimization

The gradient descent updates for the parameters $p_{uk}$ and $q_{ik}$ take the same form as in the LFM section. For the bias terms:

$$b_u := b_u + \alpha\left[\sum_{u,i\in R}\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right) - \lambda b_u\right]$$

$$b_i := b_i + \alpha\left[\sum_{u,i\in R}\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right) - \lambda b_i\right]$$

Stochastic gradient descent (one sample at a time):

$$b_u := b_u + \alpha\left[\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right) - \lambda_3 b_u\right]$$

$$b_i := b_i + \alpha\left[\left(r_{ui} - \mu - b_u - b_i - \sum_{k=1}^{K} p_{uk} q_{ik}\right) - \lambda_4 b_i\right]$$

Because P and Q are two different matrices, they usually use different regularization parameters, such as $\lambda_1$ and $\lambda_2$; the biases $b_u$ and $b_i$ likewise get their own parameters $\lambda_3$ and $\lambda_4$, as in the sketch below.
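
Putting the four stochastic update rules together, one BiasSvd SGD step for a single rating can be sketched as follows (the function name and argument layout are illustrative; mu is the global mean rating):

import numpy as np

def biassvd_sgd_step(v_pu, v_qi, b_u, b_i, r_ui, mu, alpha,
                     lambda_1, lambda_2, lambda_3, lambda_4):
    '''One SGD step on a single rating r_ui under the BiasSvd model.'''
    err = r_ui - mu - b_u - b_i - np.dot(v_pu, v_qi)        # prediction error
    new_pu = v_pu + alpha * (err * v_qi - lambda_1 * v_pu)  # user vector
    new_qi = v_qi + alpha * (err * v_pu - lambda_2 * v_qi)  # item vector
    new_bu = b_u + alpha * (err - lambda_3 * b_u)           # user bias
    new_bi = b_i + alpha * (err - lambda_4 * b_i)           # item bias
    return new_pu, new_qi, new_bu, new_bi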

Algorithm implementation

'''
BiasSvd Model
'''
import pandas as pd
import numpy as np

class BiasSvd(object):

    def __init__(self, alpha, reg_p, reg_q, reg_bu, reg_bi, number_LatentFactors=10, number_epochs=10, columns=["uid", "iid", "rating"]):
        self.alpha = alpha # Learning rate
        self.reg_p = reg_p
        self.reg_q = reg_q
        self.reg_bu = reg_bu
        self.reg_bi = reg_bi
        self.number_LatentFactors = number_LatentFactors  # Number of implicit categories
        self.number_epochs = number_epochs
        self.columns = columns

    def fit(self, dataset):
        '''
        fit dataset
        :param dataset: uid, iid, rating
        :return:
        '''

        self.dataset = pd.DataFrame(dataset)

        self.users_ratings = self.dataset.groupby(self.columns[0]).agg([list])[[self.columns[1], self.columns[2]]]
        self.items_ratings = self.dataset.groupby(self.columns[1]).agg([list])[[self.columns[0], self.columns[2]]]
        self.globalMean = self.dataset[self.columns[2]].mean()

        self.P, self.Q, self.bu, self.bi = self.sgd()

    def _init_matrix(self):
        '''
        Initialize the P and Q matrices with random values in [0, 1) as initial values
        :return:
        '''
        # User-LF
        P = dict(zip(
            self.users_ratings.index,
            np.random.rand(len(self.users_ratings), self.number_LatentFactors).astype(np.float32)
        ))
        # Item-LF
        Q = dict(zip(
            self.items_ratings.index,
            np.random.rand(len(self.items_ratings), self.number_LatentFactors).astype(np.float32)
        ))
        return P, Q

    def sgd(self):
        '''
        Use stochastic gradient descent to optimize the model parameters
        :return:
        '''
        P, Q = self._init_matrix()

        # Initialize the values of bu and bi, and set them all to 0
        bu = dict(zip(self.users_ratings.index, np.zeros(len(self.users_ratings))))
        bi = dict(zip(self.items_ratings.index, np.zeros(len(self.items_ratings))))

        for i in range(self.number_epochs):
            print("iter%d"%i)
            error_list = []
            for uid, iid, r_ui in self.dataset.itertuples(index=False):
                v_pu = P[uid]  # user latent vector
                v_qi = Q[iid]  # item latent vector
                err = np.float32(r_ui - self.globalMean - bu[uid] - bi[iid] - np.dot(v_pu, v_qi))

                # Keep a copy of the old user vector so the item update does
                # not see the already-updated v_pu.
                v_pu_old = v_pu.copy()
                v_pu += self.alpha * (err * v_qi - self.reg_p * v_pu)
                v_qi += self.alpha * (err * v_pu_old - self.reg_q * v_qi)

                P[uid] = v_pu
                Q[iid] = v_qi

                bu[uid] += self.alpha * (err - self.reg_bu * bu[uid])
                bi[iid] += self.alpha * (err - self.reg_bi * bi[iid])

                error_list.append(err ** 2)
            print("training RMSE: %.4f" % np.sqrt(np.mean(error_list)))

        return P, Q, bu, bi

    def predict(self, uid, iid):

        # If uid or iid is unknown, fall back to the global mean rating
        if uid not in self.users_ratings.index or iid not in self.items_ratings.index:
            return self.globalMean

        p_u = self.P[uid]
        q_i = self.Q[iid]

        return self.globalMean + self.bu[uid] + self.bi[iid] + np.dot(p_u, q_i)


if __name__ == '__main__':
    dtype = [("userId", np.int32), ("movieId", np.int32), ("rating", np.float32)]
    dataset = pd.read_csv("datasets/ml-latest-small/ratings.csv", usecols=range(3), dtype=dict(dtype))

    bsvd = BiasSvd(0.02, 0.01, 0.01, 0.01, 0.01, 10, 20, ["userId", "movieId", "rating"])
    bsvd.fit(dataset)

    while True:
        uid = input("uid: ")
        iid = input("iid: ")
        print(bsvd.predict(int(uid), int(iid)))

Keep going, thank you, and keep striving!
