scala - How do I access an accumulator from an object other than the one where it is defined?

Reposted. Author: 行者123. Updated: 2023-12-02 03:25:23

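I defined my helper map function as a separate def in a helper object, and it does not "see" the accumulator defined earlier in the code. The Spark documentation seems to suggest keeping "remote" functions inside objects, but how do I make that work with these accumulators?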

object mainlogic{
  val counter = sc.accumulator(0)
  val data = sc.textFile(...)// load logic here
  val myrdd = data.mapPartitionsWithIndex(mapFunction)
}

object helper{
  def mapFunction(...)={
    counter+=1 // does not compile: counter is not in scope here
  }
}

Best answer

As in any other code, a value like this has to be passed in as a parameter:

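import org.apache.spark.Accumulator

object mainlogic{
  val counter = sc.accumulator(0)
  val data = sc.textFile(...)// load logic here
  // pass the accumulator explicitly; the two placeholders stand for the
  // partition index and the partition iterator
  val myrdd = data.mapPartitionsWithIndex(helper.mapFunction(counter, _, _))
}

object helper{
  def mapFunction(counter: Accumulator[Int], ...)={
    counter+=1 // compiles now: counter is a parameter of this function
  }
}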
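
The snippet above leaves the rest of the signature elided. As a rough, self-contained sketch of the same idea, the version below assumes Spark 2.x or later, where `sc.longAccumulator` and `LongAccumulator` take the place of `sc.accumulator` and `Accumulator[Int]`. The names `Helper`, `MainLogic`, `run`, the accumulator name "records", and the in-memory sample data are all hypothetical and are not part of the original question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.util.LongAccumulator

object Helper {
  // The accumulator is an explicit parameter, so this function does not
  // depend on any value defined inside another object.
  def mapFunction(counter: LongAccumulator,
                  index: Int,
                  lines: Iterator[String]): Iterator[String] =
    lines.map { line =>
      counter.add(1L) // one increment per record
      s"$index:$line" // tag each record with its partition index
    }
}

object MainLogic {
  // Hypothetical driver logic: counts records while tagging them.
  def run(sc: SparkContext): Long = {
    val counter = sc.longAccumulator("records")
    val data = sc.parallelize(Seq("a", "b", "c"), numSlices = 2)
    // Partial application: the placeholders become the
    // (partition index, iterator) pair that mapPartitionsWithIndex supplies.
    val tagged = data.mapPartitionsWithIndex(Helper.mapFunction(counter, _, _))
    tagged.count() // transformations are lazy; an action triggers the updates
    counter.value
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("accumulator-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(run(sc)) finally sc.stop()
  }
}
```

The accumulator is only read on the driver, after an action has run; reading it before `count()` would show no updates because the map function has not executed yet.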

Be sure to keep this note from the documentation in mind, though:

For accumulator updates performed inside actions only, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task’s update may be applied more than once if tasks or job stages are re-executed.
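
One way to stay on the safe side of that note is to update the accumulator inside an action such as `foreach` instead of inside a transformation. The following is only a sketch under assumptions: it uses the Spark 2.x+ `longAccumulator` API, and the names `ActionSideCounting`, `countNonEmpty`, and the accumulator name "nonEmptyLines" are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.util.LongAccumulator

object ActionSideCounting {
  // Counts non-empty lines with an accumulator updated inside an action.
  def countNonEmpty(sc: SparkContext, lines: Seq[String]): Long = {
    val nonEmpty: LongAccumulator = sc.longAccumulator("nonEmptyLines")
    // foreach is an action, so per the documentation quoted above each
    // task's update is applied only once, even if a task is restarted.
    sc.parallelize(lines).foreach { line =>
      if (line.nonEmpty) nonEmpty.add(1L)
    }
    nonEmpty.value
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("action-side-counting").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(countNonEmpty(sc, Seq("a", "", "b"))) finally sc.stop()
  }
}
```

When the counting has to happen inside a transformation such as `mapPartitionsWithIndex`, the resulting value is better treated as approximate, since re-executed tasks or stages may apply their updates again.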

A similar question about "scala - How do I access an accumulator from an object other than the one where it is defined?" can be found on Stack Overflow: https://stackoverflow.com/questions/30632405/
