Find Common Friends - Data Mining - Scala Edition

Hello! There are many implementations of the "find common friends" algorithm available online in various languages. I had some time today, so I worked out a Scala version myself.

The complete code is available on GitHub: https://github.com/benben7466/SparkDemo/blob/master/spark-test/src/main/scala/testCommendFriend.scala

 

Input data:

A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G
O:A,H,I,J
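
Each line follows the pattern person:friend1,friend2,..., i.e. the user before the colon followed by that user's friend list. As a quick check of how one such line gets parsed (a minimal plain-Scala sketch, independent of Spark and not part of the original code; the object name is made up for illustration):

object ParseLineDemo {
  def main(args: Array[String]): Unit = {
    //Split "A:B,C,D,F,E,O" into the user and the friend list
    val line = "A:B,C,D,F,E,O"
    val fields = line.split(":")
    val person = fields(0)
    val friends = fields(1).split(",").toList
    println((person, friends)) //prints (A,List(B, C, D, F, E, O))
  }
}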

 

Core algorithm:

package chunbo.recommend

import org.apache.spark.SparkContext

//Common friend statistics
//Reference: http://www.cnblogs.com/charlesblc/p/6126346.html
object testCommendFriend {
  def index(_spark_sc: SparkContext): Unit = {

    //Read the input; each line looks like "A:B,C,D,F,E,O"
    //(Config.HDFS_HOSH is a path prefix defined elsewhere in the project)
    val friendRDD = _spark_sc.textFile(Config.HDFS_HOSH + "test/common_friend")

    //Map: parse each line into (person, that person's friend list)
    val friendKV = friendRDD.map(x => {
      val fields = x.split(":")
      val person = fields(0)
      val friends = fields(1).split(",").toList
      (person, friends)
    })

    //Invert the relation: emit one (friend, person) pair for every friend in the list
    val mapRDD = friendKV.flatMap(x => {
      for (i <- 0 until x._2.length) yield (x._2(i), x._1)
    })

    //Reduce: for each friend, concatenate every person who lists them as a friend
    val reduceRDD = mapRDD.reduceByKey(_ + "::" + _)

    //Print the result
    reduceRDD.foreach(println)

  }

}
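
To run this, you need a SparkContext and an input file at the path the code builds from Config.HDFS_HOSH (a path prefix defined elsewhere in the project). A minimal driver sketch is below; the object name, app name, and local[*] master are example values I am assuming here, not taken from the original repo:

package chunbo.recommend

import org.apache.spark.{SparkConf, SparkContext}

//Hypothetical driver: builds a local SparkContext and invokes the job above.
object RunCommonFriend {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("CommonFriendDemo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    testCommendFriend.index(sc)
    sc.stop()
  }
}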

 

The output data is as follows:

(L,D::E)
(B,A::E::F::J)
(J,O)
(H,O)
(F,A::C::D::G::L::M)
(D,A::C::E::F::G::H::K::L)
(G,M)
(M,E::F)
(O,A::F::H::I::J)
(A,B::C::D::F::G::H::I::K::O)
(I,C::O)
(K,B)
(C,A::B::E::F::G::H::K)
(E,A::B::D::F::G::H::L::M)

Explanation:

In each output pair, the element on the left is a common friend of every user in the collection on the right (the users are separated by ::).

For example, in (L,D::E), L is a common friend of users D and E.
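
The code above stops at this friend-to-users view. If you instead want the result keyed by user pairs, i.e. for every two users the list of friends they share, a common follow-up step (a sketch along the lines of the referenced article, not part of the code above) is to sort the users sharing each friend, emit one record per pair, and reduce again:

//Second stage (sketch): reduceRDD is the (friend, "A::B::...") RDD produced above
val pairRDD = reduceRDD.flatMap { case (friend, people) =>
  val persons = people.split("::").sorted //sorting makes (A,B) and (B,A) the same key
  for {
    i <- 0 until persons.length
    j <- i + 1 until persons.length
  } yield ((persons(i), persons(j)), friend)
}
val commonFriends = pairRDD.reduceByKey(_ + "::" + _)
commonFriends.foreach(println) //for the sample data this includes ((D,E),L)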


Reference: http://www.cnblogs.com/charlesblc/p/6126346.html
