gpt4 book ai didi

java - shuffle内存池空闲: SPARK with Java

转载 作者:太空宇宙 更新时间:2023-11-04 06:26:22 24 4
gpt4 key购买 nike

我想通过像这样的键加入两个列表(NoHeaderIndexed 和 NoFirstIndexed):

    final  Broadcast<JavaPairRDD<Long, Tuple2<String, String>>> c = ctx.broadcast(noHeaderIndexed);
JavaPairRDD<Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>> rs = noFirstIndexed.mapToPair(new PairFunction<Tuple2<Long, Tuple2<String, String>>, Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>>() {
@Override
public Tuple2<Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>> call(Tuple2<Long, Tuple2<String, String>> longTuple2Tuple2) throws Exception {
String s1 = "";
if (c.value().lookup(longTuple2Tuple2._1).get(0)._1 != null)
s1 = c.value().lookup(longTuple2Tuple2._1).get(0)._1;

String s2 = "";
if (c.value().lookup(longTuple2Tuple2._1).get(0)._2 != null)
s2 = c.value().lookup(longTuple2Tuple2._1).get(0)._2;
return new Tuple2<Tuple2<Tuple2<String, String>, Long>, Tuple2<Tuple2<String, String>, Long>>(new Tuple2<Tuple2<String, String>, Long>(new Tuple2<String, String>(longTuple2Tuple2._2._1,longTuple2Tuple2._2._2),longTuple2Tuple2._1),new Tuple2<Tuple2<String, String>, Long>(new Tuple2<String, String>(s1,s2),longTuple2Tuple2._1 ) );
}
});

//writeResult(rs, "rs.txt");
rs.coalesce(1,true).saveAsTextFile(path+ "rs");

但是当我尝试执行它时,它显示:

INFO ShuffleMemoryManager: Thread 61 waiting for at least 1/2N of shuffle memory pool to be free    

并且它不会终止执行。您能否向我解释一下这个问题以及如何解决它。

提前谢谢您。

最佳答案

在此命令中

rs.coalesce(1,true).saveAsTextFile(path+ "rs");

您只创建一个分区,以便所有数据都将到达一个节点。您需要增加分区数量

根据您的数据大小尝试此操作

rs.coalesce(10,true).saveAsTextFile(path+ "rs");

关于java - shuffle内存池空闲: SPARK with Java,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/26757945/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com