
java - How to combine two JavaPairRDDs using the left keys and the right values

Reposted. Author: 行者123. Updated: 2023-11-30 07:00:45

First JavaPairRDD:

[(A, 0), (B, 0), (C, 0), (D, 0), (E, 0)... ]

Second JavaPairRDD:

[(B, 1), (C, 5), (D, 21)]

The output should be:

[(A, 0), (B, 1), (C, 5), (D, 21), (E, 0)... ]

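I want to keep the keys of the first RDD and take the values from the second RDD wherever it has a matching key. I tried the aggregateByKey, union, and join (left or right) methods, but none of them worked.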

JavaPairRDD<String, Object> currentRdd = firstRdd.fullOuterJoin(secondRdd).map(stringTuple2Tuple2 -> new Tuple2<String, Long>(stringTuple2Tuple2._1(), stringTuple2Tuple2._2()._2().get()));

How can I combine two JavaPairRDDs like this?

Best answer

If you want every key in firstRdd to appear in the final result, or you simply don't care about keys that appear only in secondRdd, you should use leftOuterJoin instead of fullOuterJoin.
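To make the difference in key coverage concrete, here is a minimal plain-Java sketch (no Spark involved) that only computes which keys each kind of join would keep. The class name JoinKeySketch, its helper methods, and the extra key "Z" on the right side are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Plain-Java illustration of join key coverage; this is not Spark code.
class JoinKeySketch {

    // A left outer join keeps exactly the keys of the left side.
    static Set<String> leftOuterKeys(Set<String> left, Set<String> right) {
        return new LinkedHashSet<>(left);
    }

    // A full outer join keeps the union of the keys of both sides.
    static Set<String> fullOuterKeys(Set<String> left, Set<String> right) {
        Set<String> keys = new LinkedHashSet<>(left);
        keys.addAll(right);
        return keys;
    }

    public static void main(String[] args) {
        Set<String> left = new LinkedHashSet<>(Arrays.asList("A", "B", "C", "D", "E"));
        // "Z" is a hypothetical key that exists only on the right side.
        Set<String> right = new LinkedHashSet<>(Arrays.asList("B", "C", "D", "Z"));

        System.out.println(leftOuterKeys(left, right)); // prints [A, B, C, D, E]
        System.out.println(fullOuterKeys(left, right)); // prints [A, B, C, D, E, Z]
    }
}
```

With a full outer join, a key such as "Z" would show up in the result with no left-hand value, which is not what the expected output in the question asks for.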

What leftOuterJoin does:

Perform a left outer join of this and other. For each element (k, v) in this, the resulting RDD will either contain all pairs (k, (v, Some(w))) for w in other, or the pair (k, (v, None)) if no elements in other have key k.
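The shape described above, (k, (v, Some(w))) or (k, (v, None)), can be sketched with plain java.util collections before bringing Spark into it. This is only an illustration under simplifying assumptions: keys are unique on each side (an RDD may hold duplicate keys), java.util.Optional stands in for the Optional type returned by Spark, and the class LeftOuterJoinSketch with its methods leftOuterJoin and preferRight is hypothetical.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Plain-Java illustration of left outer join semantics; this is not Spark code.
class LeftOuterJoinSketch {

    // For every key in left, pair its value with the matching value in right, if any.
    // Keys that appear only in right are dropped.
    static Map<String, Map.Entry<Integer, Optional<Integer>>> leftOuterJoin(
            Map<String, Integer> left, Map<String, Integer> right) {
        Map<String, Map.Entry<Integer, Optional<Integer>>> joined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : left.entrySet()) {
            Optional<Integer> match = Optional.ofNullable(right.get(e.getKey()));
            joined.put(e.getKey(), new AbstractMap.SimpleImmutableEntry<>(e.getValue(), match));
        }
        return joined;
    }

    // Prefer the right-hand value; fall back to the left-hand value when there is no match.
    static Map<String, Integer> preferRight(
            Map<String, Map.Entry<Integer, Optional<Integer>>> joined) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map.Entry<Integer, Optional<Integer>>> e : joined.entrySet()) {
            Map.Entry<Integer, Optional<Integer>> pair = e.getValue();
            result.put(e.getKey(), pair.getValue().orElse(pair.getKey()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new LinkedHashMap<>();
        for (String k : new String[] {"A", "B", "C", "D", "E"}) {
            left.put(k, 0);
        }
        Map<String, Integer> right = new LinkedHashMap<>();
        right.put("B", 1);
        right.put("C", 5);
        right.put("D", 21);

        // prints {A=0, B=1, C=5, D=21, E=0}
        System.out.println(preferRight(leftOuterJoin(left, right)));
    }
}
```

The second step, preferRight, plays the same role as the mapValues call in the Spark versions that follow: it collapses each (left value, optional right value) pair into a single value.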

Scala version:

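import org.apache.spark.rdd.RDD

val left = sc.parallelize(Array(("A", 0), ("B", 0), ("C", 0), ("D", 0), ("E", 0)))
val right = sc.parallelize(Array(("B", 1), ("C", 5), ("D", 21)))
// Keep every key from left; the right-hand value is wrapped in an Option
val lojoin: RDD[(String, (Int, Option[Int]))] = left.leftOuterJoin(right)
// Use the right-hand value when present, otherwise keep the left-hand value
val target = lojoin.mapValues(p => p._2.getOrElse(p._1))
target.foreach(println)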

Java version:

List<Tuple2<String, Integer>> left = new ArrayList<Tuple2<String, Integer>>();
left.add(new Tuple2<String, Integer>("A", 0));
left.add(new Tuple2<String, Integer>("B", 0));
left.add(new Tuple2<String, Integer>("C", 0));
left.add(new Tuple2<String, Integer>("D", 0));
left.add(new Tuple2<String, Integer>("E", 0));

List<Tuple2<String, Integer>> right = new ArrayList<Tuple2<String, Integer>>();
right.add(new Tuple2<String, Integer>("B", 1));
right.add(new Tuple2<String, Integer>("C", 5));
right.add(new Tuple2<String, Integer>("D", 21));

JavaPairRDD<String, Integer> leftRdd = sc.parallelizePairs(left);
JavaPairRDD<String, Integer> rightRdd = sc.parallelizePairs(right);

JavaPairRDD<String, Tuple2<Integer, Optional<Integer>>> lojRdd = leftRdd.leftOuterJoin(rightRdd);

// Use the right-hand value when present, otherwise keep the left-hand value
JavaPairRDD<String, Integer> result = lojRdd.mapValues(new Function<Tuple2<Integer, Optional<Integer>>, Integer>() {
    @Override
    public Integer call(Tuple2<Integer, Optional<Integer>> v1) throws Exception {
        return v1._2().or(v1._1());
    }
});

// Print each (key, value) pair
result.foreach(new VoidFunction<Tuple2<String, Integer>>() {
    @Override
    public void call(Tuple2<String, Integer> t) throws Exception {
        System.out.println(t._1() + " " + t._2());
    }
});

For "java - How to combine two JavaPairRDDs using the left keys and the right values", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30293267/
