
scala - Spark, Scala: How to subtract the values in RDD pairs based on their key?

Reposted. Author: 行者123. Updated: 2023-12-03 00:33:37

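I have several RDDs of type RDD[(String, Int)]. I want to subtract the integer values based on their key.

Here is an example. If the input RDDs are

valid_record   = (TcustomerTDL_2016266,16)
deleted_record = (TcustomerTDL_2016266,8)

then, since the keys are the same, the integer values must be subtracted. The expected result is (TcustomerTDL_2016266,8), i.e. 16 - 8 = 8. I tried subtractByKey, but it does not seem to work.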

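I used the following code:

val changes_total = valid_record.subtractByKey(deleted_record)

Please let me know whether there is another way to do this, or whether this usage is incorrect.

Here is the full code: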

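import org.apache.spark.{SparkConf, SparkContext}

val Conf = new SparkConf().setAppName("Module").setMaster("local")
val sc = new SparkContext(Conf)
// Read every file in the directory as a (path, content) pair
val incoming_file = sc.wholeTextFiles("D:/Users/Documents/siva_hourly") //changed code
// Key each file by the path segment at index 6 and split its content into lines
val output = incoming_file.map { case (k, v) => (k.split("/")(6), v.split("\\r?\\n")) }
output.cache()
// Extract the change type (third \001-delimited field) from every line
val change_type = output.map { case (k, v) => (k, v.toList.map(x => x.split("\001")(2))) } //changed code
// Count "D" records per file, then shorten the key to its first two "_"-separated parts
val change_delete_count = change_type.map { case (k, v) => (k, v.filter(x => x == "D").length) }
val change_record_foreach4 = change_delete_count.map { case (k, v) => (k.split("_"), v) }
val change_record_foreach3 = change_record_foreach4.map { case (k, v) => (k(0) + '_' + k(1), v) }
// Count "A" or "I" records per file and shorten the key the same way
val change_valid_count = change_type.map { case (k, v) => (k, v.filter(x => x == "A" || x == "I").length) }
val change_record_foreach = change_valid_count.map { case (k, v) => (k.split("_"), v) }
val change_record_foreach1 = change_record_foreach.map { case (k, v) => (k(0) + '_' + k(1), v) }
// Sum the counts per shortened key
val valid_record = change_record_foreach1.reduceByKey((x, y) => x + y)
val deleted_record = change_record_foreach3.reduceByKey((x, y) => x + y)
// Attempted subtraction (does not produce the expected result)
val changes_total = valid_record.subtractByKey(deleted_record)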

Best answer

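This is not the right use of subtractByKey. It does not subtract values; it removes every element whose key also appears in the other RDD.

Here is an example of how subtractByKey works.

Suppose you have two pair RDDs as follows.

rdd   = {(1, 2), (3, 4), (3, 6)}
other = {(3, 9)}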

rdd.subtractByKey(other)

The result is

{(1, 2)}
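
The behaviour described above can be reproduced with a small local job. This is only a sketch: the app name, the local[1] master and the camelCase variable names are assumptions made for illustration, and the second half simply applies the same call to the sample pair from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumed local setup, for illustration only
val conf = new SparkConf().setAppName("SubtractByKeyDemo").setMaster("local[1]")
val sc = SparkContext.getOrCreate(conf)

val rdd = sc.parallelize(Seq((1, 2), (3, 4), (3, 6)))
val other = sc.parallelize(Seq((3, 9)))

// Keeps only the pairs whose key does NOT appear in `other`;
// the values stored in `other` are ignored entirely.
val remaining = rdd.subtractByKey(other).collect().toList

// Sample pair from the question: the key exists on both sides,
// so the whole entry is dropped rather than reduced to 16 - 8.
val validRecord = sc.parallelize(Seq(("TcustomerTDL_2016266", 16)))
val deletedRecord = sc.parallelize(Seq(("TcustomerTDL_2016266", 8)))
val fromQuestion = validRecord.subtractByKey(deletedRecord).collect().toList

println(remaining)    // List((1,2))
println(fromQuestion) // List()

sc.stop()
```

This would explain why the question's changes_total did not contain the expected (TcustomerTDL_2016266,8): the matching key is filtered out instead of having its value reduced.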

Instead, you can join the two RDDs on their key and subtract the paired values:

val joinRDD = valid_record.join(deleted_record)
val resultRDD = joinRDD.mapValues(x => x._1 - x._2)
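
One caveat with the join approach is that join is an inner join, so a key that appears in only one of the two RDDs is dropped from the result. If keys without any deletions should still be reported, leftOuterJoin may be a better fit. The sketch below is not part of the original answer; the second key (TcustomerTDL_2016267), the app name and the variable names are invented for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumed local setup, for illustration only
val conf = new SparkConf().setAppName("SubtractValuesDemo").setMaster("local[1]")
val sc = SparkContext.getOrCreate(conf)

// Hypothetical sample data: the second key has no matching deleted count
val validRecord = sc.parallelize(Seq(
  ("TcustomerTDL_2016266", 16),
  ("TcustomerTDL_2016267", 5)
))
val deletedRecord = sc.parallelize(Seq(("TcustomerTDL_2016266", 8)))

// Inner join: only keys present in BOTH RDDs survive
val innerResult = validRecord.join(deletedRecord)
  .mapValues { case (valid, deleted) => valid - deleted }
  .collect().toMap

// Left outer join: every key of validRecord survives;
// a missing deleted count arrives as None and is treated as 0 here
val outerResult = validRecord.leftOuterJoin(deletedRecord)
  .mapValues { case (valid, deletedOpt) => valid - deletedOpt.getOrElse(0) }
  .collect().toMap

println(innerResult) // Map(TcustomerTDL_2016266 -> 8)
println(outerResult) // contains both keys: 2016266 -> 8 and 2016267 -> 5

sc.stop()
```

Whether a missing deleted count should default to 0 depends on the data. In the question's pipeline both RDDs are derived from the same files, so the plain join is likely sufficient there.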

Regarding "scala - Spark, Scala: How to subtract the values in RDD pairs based on their key?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39824873/
