apache-spark - Spark MLlib: collaborative filtering with implicit feedback

Reposted. Author: 行者123. Updated: 2023-12-04 04:25:31
I am building an implicit feedback recommender model with Spark 1.0.0, and I am trying to follow the example on their collaborative filtering page:
http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html#explicit-vs-implicit-feedback

I have even loaded the test dataset that the example refers to:
http://codesearch.ruethschilling.info/xref/apache-foundation/spark/mllib/data/als/test.data

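However, when I try to run the implicit feedback model:
val alpha = 0.01
val model = ALS.trainImplicit(ratings, rank, numIterations, alpha)

(where ratings is exactly the ratings from their dataset, rank = 10 and numIterations = 20), I get the following error: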

scala> val model = ALS.trainImplicit(ratings, rank, numIterations, alpha)
<console>:26: error: overloaded method value trainImplicit with alternatives:
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and>
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,alpha: Double)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and>
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,blocks: Int,alpha: Double)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and>
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,blocks: Int,alpha: Double,seed: Long)org.apache.spark.mllib.recommendation.MatrixFactorizationModel
cannot be applied to (org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating], Int, Int, Double)
val model = ALS.trainImplicit(ratings, rank, numIterations, alpha)

Interestingly, the model runs fine when I use ALS.train instead of trainImplicit.

Best answer

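The example appears to be out of sync with the implementation: there is no four-parameter overload of trainImplicit, which is exactly what the error message is telling you. However, if you look at the Scala source code for ALS, you will see that the three-parameter overload is implemented in terms of the six-parameter overload, using a few "magic numbers":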

def trainImplicit(ratings: RDD[Rating], rank: Int, iterations: Int)
    : MatrixFactorizationModel = {
  trainImplicit(ratings, rank, iterations, 0.01, -1, 1.0)
}

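Matching that call against the six-parameter signature in the error message, 0.01 is lambda, -1 is blocks and 1.0 is alpha. This suggests that 0.01 is a reasonable default for lambda. (It would be worth checking this with someone who understands the ML side more deeply.) That should give you enough information to call the five- or six-parameter overload sensibly. (Of course, if you know enough to pick better values, so much the better!)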
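To see why lambda and alpha are separate arguments, it may help to look at the objective usually associated with implicit-feedback ALS. What follows is the standard formulation from Hu, Koren and Volinsky's "Collaborative Filtering for Implicit Feedback Datasets", which the MLlib guide cites; MLlib's implementation may differ in details such as how the regularization term is scaled, so treat it as a sketch rather than a description of the code. Here $r_{ui}$ is the raw observation for user $u$ and item $i$, and $x_u$, $y_i$ are the latent factor vectors:

```latex
p_{ui} =
\begin{cases}
1 & \text{if } r_{ui} > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad
c_{ui} = 1 + \alpha \, r_{ui}

\min_{x, y} \; \sum_{u, i} c_{ui} \left( p_{ui} - x_u^{\top} y_i \right)^2
  + \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right)
```

In this form alpha controls how quickly confidence grows with the observed value, while lambda is the regularization weight. Passing a single Double in fourth position therefore cannot stand in for both: in the five- and six-parameter overloads the fourth argument is lambda, and alpha comes last.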

For example:
val model = ALS.trainImplicit(ratings, rank, numIterations, 0.01, alpha)

Or:
val model = ALS.trainImplicit(ratings, rank, numIterations, 0.01, -1, alpha)
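If the positional overloads are easy to get wrong, one alternative is to configure an ALS instance through its setters, so that every hyperparameter is named at the call site. The sketch below assumes the setters setImplicitPrefs, setRank, setIterations, setLambda and setAlpha are available in your Spark version (as far as I know they are in the 1.0 line, but it is worth confirming in the API documentation). The object name ImplicitAlsSketch, the local[2] master and the toy ratings are made up for illustration and are not taken from the question or from test.data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object; the name is not part of Spark.
object ImplicitAlsSketch {

  // Each hyperparameter is set by name, so lambda and alpha cannot be
  // swapped or dropped the way they can in a positional call.
  def train(ratings: RDD[Rating], rank: Int, numIterations: Int,
            lambda: Double, alpha: Double): MatrixFactorizationModel =
    new ALS()
      .setImplicitPrefs(true) // switch to the implicit-feedback variant
      .setRank(rank)
      .setIterations(numIterations)
      .setLambda(lambda)
      .setAlpha(alpha)
      .run(ratings)

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("ImplicitAlsSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up toy observations (user, product, strength), not the test.data file.
      val ratings = sc.parallelize(Seq(
        Rating(1, 1, 5.0), Rating(1, 2, 1.0),
        Rating(2, 1, 5.0), Rating(2, 3, 1.0),
        Rating(3, 2, 5.0), Rating(3, 3, 5.0)))

      // rank and numIterations as in the question; lambda is the 0.01 default discussed above.
      val model = train(ratings, rank = 10, numIterations = 20, lambda = 0.01, alpha = 0.01)
      println(s"rank = ${model.rank}, score(1, 1) = ${model.predict(1, 1)}")
    } finally {
      sc.stop()
    }
  }
}
```

In the spark-shell, where sc and ratings already exist, only the new ALS()...run(ratings) chain is needed. Any setter that is not called keeps the library default.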

Finally, in case you were not aware of it, there is fairly good API documentation for ALS.

Regarding "apache-spark - Spark MLlib: collaborative filtering with implicit feedback", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25649454/
