Find Common Friends - Data Mining - Scala Edition

Hello. There are many implementations of the "find common friends" algorithm on the Internet, in many languages. With some free time today, I worked through a Scala version myself.

The complete code is available on GitHub: https://github.com/benben7466/SparkDemo/blob/master/spark-test/src/main/scala/testCommendFriend.scala

Input data (one record per line, in the form person:friend,friend,...):

A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G
O:A,H,I,J

Core algorithm:

package chunbo.recommend

import org.apache.spark.SparkContext

// Common friend statistics
// Reference: http://www.cnblogs.com/charlesblc/p/6126346.html
object testCommendFriend {
  def index(_spark_sc: SparkContext): Unit = {

    // Get data
    val friendRDD = _spark_sc.textFile(Config.HDFS_HOSH + "test/common_friend")

    // Map: parse each line into (person, list of friends)
    val friendKV = friendRDD.map(x => {
      val fields = x.split(":")
      val person = fields(0)
      val friends = fields(1).split(",").toList
      (person, friends)
    })

    // Map: emit one (friend, person) pair per entry in the friend list
    val mapRDD = friendKV.flatMap(x => {
      for (i <- 0 until x._2.length) yield (x._2(i), x._1)
    })

    // Reduce: join all persons that list the same friend
    val reduceRDD = mapRDD.reduceByKey(_ + "::" + _)

    // Print
    reduceRDD.foreach(println)

  }

}

The output data is as follows:

(L,D::E)
(B,A::E::F::J)
(J,O)
(H,O)
(F,A::C::D::G::L::M)
(D,A::C::E::F::G::H::K::L)
(G,M)
(M,E::F)
(O,A::F::H::I::J)
(A,B::C::D::F::G::H::I::K::O)
(I,C::O)
(K,B)
(C,A::B::E::F::G::H::K)
(E,A::B::D::F::G::H::L::M)
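
The same map and reduce steps can also be tried locally on plain Scala collections, without a Spark cluster. The following is only a minimal sketch under that assumption: the object name InvertFriendsSketch and the methods parseLine and invert are made up for illustration and are not part of the original repository, and groupBy stands in for Spark's reduceByKey.

```scala
// Plain-Scala sketch of the map and reduce steps (no Spark required).
// All names in this object are illustrative, not from the original repository.
object InvertFriendsSketch {

  // Parse one input line such as "A:B,C,D" into (person, friends).
  def parseLine(line: String): (String, List[String]) = {
    val fields = line.split(":")
    (fields(0), fields(1).split(",").toList)
  }

  // Emit (friend, person) pairs, group them by friend,
  // and join the persons of each group with "::" in sorted order.
  def invert(lines: Seq[String]): Map[String, String] =
    lines
      .map(parseLine)
      .flatMap { case (person, friends) => friends.map(friend => (friend, person)) }
      .groupBy(_._1)
      .map { case (friend, pairs) => friend -> pairs.map(_._2).sorted.mkString("::") }

  def main(args: Array[String]): Unit = {
    // Two records taken from the input data above
    val sample = Seq("D:A,E,F,L", "E:B,C,D,M,L")
    invert(sample).toSeq.sortBy(_._1).foreach(println)
  }
}
```

With only the records for D and E as input, the entry for key L is D::E, which matches the (L,D::E) record in the Spark output.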

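Explanation:

Each output record has the form (key,value), separated by a comma. The key on the left is a friend, and the value on the right is the set of users, joined by "::", whose friend lists contain that friend. In other words, the key is a common friend of all the users on the right.

For example, in (L,D::E), L is a common friend of users D and E.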
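
If the result is needed per pair of users rather than per friend, one possible follow-up step is to expand each record into user pairs and group by pair. This step is not part of the original code; the sketch below assumes the records are held in a plain Map in the same "::"-joined form as the output above, and the names CommonFriendPairsSketch and commonFriends are hypothetical.

```scala
// Sketch of a follow-up step: from (friend -> "user::user::...") records
// to ((userA, userB) -> common friends). Names are illustrative only.
object CommonFriendPairsSketch {

  def commonFriends(inverted: Map[String, String]): Map[(String, String), List[String]] =
    inverted.toSeq
      .flatMap { case (friend, joined) =>
        // Sorting first means each unordered pair is produced once, as (smaller, larger)
        val users = joined.split("::").toList.sorted
        users.combinations(2).map { case List(a, b) => ((a, b), friend) }
      }
      .groupBy(_._1)
      .map { case (pair, hits) => pair -> hits.map(_._2).sorted.toList }

  def main(args: Array[String]): Unit = {
    // A few records copied from the output above
    val records = Map("L" -> "D::E", "M" -> "E::F", "I" -> "C::O", "G" -> "M")
    commonFriends(records).foreach(println)
  }
}
```

A record with a single user on the right, such as (G,M), produces no pair and therefore does not appear in the result.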

Reference: http://www.cnblogs.com/charlesblc/p/6126346.html

4 July 2020, 10:58 | Views: 4155
