
scala - Spark : Average of values instead of sum in reduceByKey using Scala

Reposted. Author: 行者123. Updated: 2023-12-03 00:47:28

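When I call reduceByKey, it sums all the values that share the same key. Is there a way to compute the average of the values for each key instead?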

// I calculate the sum like this and don't know how to calculate the avg
reduceByKey((x,y)=>(x+y)).collect


Array(((Type1,1),4.0), ((Type1,1),9.2), ((Type1,2),8), ((Type1,2),4.5), ((Type1,3),3.5),
((Type1,3),5.0), ((Type2,1),4.6), ((Type2,1),4), ((Type2,1),10), ((Type2,1),4.3))

Best answer

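One approach is to use mapValues and reduceByKey, which is easier than aggregateByKey: pair each value with a count of 1, sum the values and the counts per key, then divide each sum by its count.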

.mapValues(value => (value, 1)) // pair each value with a count of 1
.reduceByKey {
  case ((sumL, countL), (sumR, countR)) =>
    (sumL + sumR, countL + countR) // add up sums and counts per key
}
.mapValues {
  case (sum, count) => sum / count // average = sum / count
}
.collect
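To make the chain above easier to try out, here is a minimal self-contained sketch that runs it on the sample data from the question. The object name `AverageByKey`, the helper names `averages` and `averagesWithAggregate`, and the `local[*]` master are assumptions made for illustration and are not part of the original answer. The integer values in the sample are written as Doubles so that the RDD has a single value type. An `aggregateByKey` version is included only for comparison with the approach the answer calls easier.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical wrapper object; the names here are illustrative only.
object AverageByKey {

  // Sample data from the question, with integer values written as Doubles.
  val sample: Seq[((String, Int), Double)] = Seq(
    (("Type1", 1), 4.0), (("Type1", 1), 9.2),
    (("Type1", 2), 8.0), (("Type1", 2), 4.5),
    (("Type1", 3), 3.5), (("Type1", 3), 5.0),
    (("Type2", 1), 4.6), (("Type2", 1), 4.0),
    (("Type2", 1), 10.0), (("Type2", 1), 4.3)
  )

  // The approach from the answer: mapValues + reduceByKey + mapValues.
  def averages(sc: SparkContext): Map[(String, Int), Double] =
    sc.parallelize(sample)
      .mapValues(value => (value, 1)) // pair each value with a count of 1
      .reduceByKey {
        case ((sumL, countL), (sumR, countR)) =>
          (sumL + sumR, countL + countR) // per-key sum and count
      }
      .mapValues { case (sum, count) => sum / count } // Double / Int
      .collect()
      .toMap

  // Equivalent computation with aggregateByKey, shown for comparison.
  def averagesWithAggregate(sc: SparkContext): Map[(String, Int), Double] =
    sc.parallelize(sample)
      .aggregateByKey((0.0, 0))(
        (acc, value) => (acc._1 + value, acc._2 + 1), // fold a value into (sum, count)
        (a, b) => (a._1 + b._1, a._2 + b._2)          // merge two partial (sum, count) pairs
      )
      .mapValues { case (sum, count) => sum / count }
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("AverageByKey").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      averages(sc).toSeq.sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Note that `sum / count` yields a fractional average here only because the sums are Doubles. If the values were Ints, the division would truncate, so convert first (for example with `sum.toDouble / count`).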

https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch04.html

Regarding "scala - Spark : Average of values instead of sum in reduceByKey using Scala", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40087483/
