Spark FPGrowth: Mining Association Purchase Probabilities for Retail Products

 
package sparkFIM

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.mllib.fpm.FPGrowth

/**
 * Created by zhangshuai on 2017/4/19.
 */
object test {
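     def main(args: Array[String]): Unit = {
       // Run locally. spark.sql.warehouse.dir is a Spark SQL setting, kept here from the original configuration.
       val conf = new SparkConf().setAppName("FPGrowth").setMaster("local").set("spark.sql.warehouse.dir","E:/1/1.txt")
       val sc = new SparkContext(conf)
       // Minimum support (fraction of all transactions)
       val minSupport = 0.2
       // Minimum confidence
       val minConfidence = 0.8
       // Number of data partitions
       val numPartitions = 2
       // Load the demo data
       val data = sc.textFile("E:/demo.txt")
       // Split each line into its items
       val transactions = data.map(x => x.split(" "))
       transactions.cache()
       // Create the algorithm instance
       val fpg = new FPGrowth()

       // Set the minimum support and the number of partitions used for training
       fpg.setMinSupport(minSupport)
       fpg.setNumPartitions(numPartitions)
       // Run the algorithm on the transactions
       val model = fpg.run(transactions)
       // Print every frequent itemset with its occurrence count
       model.freqItemsets.collect().foreach { itemset =>
         println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
       }
       // Generate the rules once, keeping only those that meet the minimum confidence
       val rules = model.generateAssociationRules(minConfidence).collect()
       rules.foreach { rule =>
         println(rule.antecedent.mkString("[", ",", "]") + " => " + rule.consequent.mkString("[", ",", "]") + ", " + rule.confidence)
       }

       // Print the number of rules
       println(rules.length)

       // Release Spark resources
       sc.stop()
    }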

}
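
The minSupport value in the program is a fraction, not a count. As far as I know, Spark's RDD-based FPGrowth converts it to an absolute threshold by rounding minSupport times the number of transactions up to the next integer. The sketch below reproduces that arithmetic for the nine transactions in demo.txt; the object and method names are illustrative and not part of Spark.

```scala
object MinCountSketch {
  // Smallest number of transactions an itemset must appear in to be frequent.
  def minCount(minSupport: Double, numTransactions: Long): Long =
    math.ceil(minSupport * numTransactions).toLong

  def main(args: Array[String]): Unit = {
    // 0.2 * 9 = 1.8, rounded up to 2
    println(minCount(0.2, 9L))
  }
}
```

With a threshold of 2, any itemset that occurs in at least two of the nine transactions is reported as frequent.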
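Contents of demo.txt, one transaction per line with items separated by single spaces (牛奶 = milk, 鸡蛋 = eggs, 面包 = bread, 薯片 = potato chips, 爆米花 = popcorn, 啤酒 = beer, 黄油 = butter):
牛奶 鸡蛋 面包 薯片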
鸡蛋 爆米花 薯片 啤酒
鸡蛋 面包 薯片
牛奶 鸡蛋 面包 爆米花 薯片 啤酒
牛奶 面包 啤酒
鸡蛋 面包 啤酒
牛奶 面包 薯片
牛奶 鸡蛋 面包 黄油 薯片
牛奶 鸡蛋 黄油 薯片
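
The program splits each line on a single space, which works for this clean file. Real sales data often contains extra whitespace or the same item twice in one basket, and Spark's FPGrowth rejects a transaction whose items are not unique. A minimal sketch of a more tolerant parser, assuming whitespace-separated items (the ParseSketch and parseLine names are hypothetical):

```scala
object ParseSketch {
  // Split one line on runs of whitespace, drop empty tokens,
  // and keep only the first occurrence of each item.
  def parseLine(line: String): Array[String] =
    line.trim.split("\\s+").filter(_.nonEmpty).distinct

  def main(args: Array[String]): Unit = {
    // Double space and a repeated item are both tolerated.
    println(parseLine("牛奶  鸡蛋 面包 牛奶").mkString("[", ",", "]"))
  }
}
```

In the Spark job this function could be passed to data.map in place of x.split(" "), followed by a filter that drops empty transactions.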
// All frequent itemsets with their occurrence counts
[面包], 7
[面包,薯片], 5
[面包,薯片,鸡蛋], 4
[面包,鸡蛋], 5
[啤酒], 4
[啤酒,面包], 3
[啤酒,面包,鸡蛋], 2
[啤酒,薯片], 2
[啤酒,薯片,鸡蛋], 2
[啤酒,牛奶], 2
[啤酒,牛奶,面包], 2
[啤酒,鸡蛋], 3
[鸡蛋], 7
[黄油], 2
[黄油,薯片], 2
[黄油,薯片,鸡蛋], 2
[黄油,牛奶], 2
[黄油,牛奶,薯片], 2
[黄油,牛奶,薯片,鸡蛋], 2
[黄油,牛奶,鸡蛋], 2
[黄油,鸡蛋], 2
[爆米花], 2
[爆米花,啤酒], 2
[爆米花,啤酒,薯片], 2
[爆米花,啤酒,薯片,鸡蛋], 2
[爆米花,啤酒,鸡蛋], 2
[爆米花,薯片], 2
[爆米花,薯片,鸡蛋], 2
[爆米花,鸡蛋], 2
[薯片], 7
[薯片,鸡蛋], 6
[牛奶], 6
[牛奶,面包], 5
[牛奶,面包,薯片], 4
[牛奶,面包,薯片,鸡蛋], 3
[牛奶,面包,鸡蛋], 3
[牛奶,薯片], 5
[牛奶,薯片,鸡蛋], 4
[牛奶,鸡蛋], 4
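
The counts above can be cross-checked without Spark by counting, for a given itemset, how many transactions contain all of its items. This is a brute-force sketch for the nine demo transactions only, not how FPGrowth works internally; the SupportSketch name is illustrative.

```scala
object SupportSketch {
  // The nine transactions from demo.txt, each as a set of items.
  val transactions: Seq[Set[String]] = Seq(
    "牛奶 鸡蛋 面包 薯片",
    "鸡蛋 爆米花 薯片 啤酒",
    "鸡蛋 面包 薯片",
    "牛奶 鸡蛋 面包 爆米花 薯片 啤酒",
    "牛奶 面包 啤酒",
    "鸡蛋 面包 啤酒",
    "牛奶 面包 薯片",
    "牛奶 鸡蛋 面包 黄油 薯片",
    "牛奶 鸡蛋 黄油 薯片"
  ).map(_.split(" ").toSet)

  // Number of transactions that contain every item of the itemset.
  def support(itemset: Set[String]): Int =
    transactions.count(t => itemset.subsetOf(t))

  def main(args: Array[String]): Unit = {
    println(support(Set("面包", "薯片")))               // 5
    println(support(Set("牛奶", "面包", "薯片", "鸡蛋"))) // 3
  }
}
```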
// Recommendation rules (antecedent => consequent, confidence)
[黄油] => [薯片], 1.0
[黄油] => [牛奶], 1.0
[黄油] => [鸡蛋], 1.0
[爆米花,薯片] => [啤酒], 1.0
[爆米花,薯片] => [鸡蛋], 1.0
[薯片] => [鸡蛋], 0.8571428571428571
[牛奶,薯片] => [面包], 0.8
[牛奶,薯片] => [鸡蛋], 0.8
[啤酒,薯片,鸡蛋] => [爆米花], 1.0
[黄油,牛奶,鸡蛋] => [薯片], 1.0
[爆米花,啤酒] => [薯片], 1.0
[爆米花,啤酒] => [鸡蛋], 1.0
[爆米花] => [啤酒], 1.0
[爆米花] => [薯片], 1.0
[爆米花] => [鸡蛋], 1.0
[爆米花,啤酒,薯片] => [鸡蛋], 1.0
[面包,鸡蛋] => [薯片], 0.8
[黄油,牛奶,薯片] => [鸡蛋], 1.0
[爆米花,啤酒,鸡蛋] => [薯片], 1.0
[牛奶,面包] => [薯片], 0.8
[牛奶] => [面包], 0.8333333333333334
[牛奶] => [薯片], 0.8333333333333334
[鸡蛋] => [薯片], 0.8571428571428571
[面包,薯片] => [鸡蛋], 0.8
[面包,薯片] => [牛奶], 0.8
[黄油,薯片] => [鸡蛋], 1.0
[黄油,薯片] => [牛奶], 1.0
[黄油,薯片,鸡蛋] => [牛奶], 1.0
[爆米花,薯片,鸡蛋] => [啤酒], 1.0
[牛奶,鸡蛋] => [薯片], 1.0
[牛奶,面包,鸡蛋] => [薯片], 1.0
[黄油,鸡蛋] => [薯片], 1.0
[黄油,鸡蛋] => [牛奶], 1.0
[啤酒,牛奶] => [面包], 1.0
[啤酒,薯片] => [鸡蛋], 1.0
[啤酒,薯片] => [爆米花], 1.0
[爆米花,鸡蛋] => [啤酒], 1.0
[爆米花,鸡蛋] => [薯片], 1.0
[黄油,牛奶] => [薯片], 1.0
[黄油,牛奶] => [鸡蛋], 1.0
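
Each confidence value above is the count of the combined itemset divided by the count of the antecedent, that is, confidence(X => Y) = freq(X ∪ Y) / freq(X). A rule is kept when this ratio is at least minConfidence, which is why rules at exactly 0.8 appear in the list. A small sketch of that calculation using counts from the frequent itemset listing (the ConfidenceSketch name is illustrative):

```scala
object ConfidenceSketch {
  // confidence(X => Y) = freq(X ∪ Y) / freq(X)
  def confidence(freqUnion: Long, freqAntecedent: Long): Double =
    freqUnion.toDouble / freqAntecedent.toDouble

  def main(args: Array[String]): Unit = {
    // [薯片] => [鸡蛋]: freq([薯片,鸡蛋]) = 6, freq([薯片]) = 7
    println(confidence(6L, 7L))
    // [牛奶] => [面包]: freq([牛奶,面包]) = 5, freq([牛奶]) = 6
    println(confidence(5L, 6L))
    // [牛奶,薯片] => [鸡蛋]: 4 / 5, exactly at the 0.8 threshold
    println(confidence(4L, 5L) >= 0.8)
  }
}
```

Confidence here is the estimated probability that a basket containing the antecedent also contains the consequent, which is the "associated purchase probability" in the title.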

 
