gpt4 book ai didi

JavaDStream 在 lambda 中打印 RDD 到控制台

转载 作者:搜寻专家 更新时间:2023-11-01 02:59:59 26 4
gpt4 key购买 nike

我是 spark 的新手,我正在尝试创建简单的 JavaDStream 以使用 spark-testing-base API 测试我的工作。到目前为止我所做的是:

    JavaStreamingContext streamingContext = new 
JavaStreamingContext(jsc(),Durations.seconds(10));
List<String> list = new LinkedList<String>();
list.add("first");
list.add("second");
list.add("third");
JavaRDD<String> myVeryOwnRDD = jsc().parallelize(list);
Queue<JavaRDD<String>> queue = new LinkedList<JavaRDD<String>>();
queue.add( myVeryOwnRDD );
JavaDStream<String> javaDStream = streamingContext.queueStream( queue );

javaDStream.foreachRDD( x-> {
x.collect().stream().forEach(n-> System.out.println("item of list: "+n));
});

我希望它能打印我的列表……但没有。我得到了它:

12:19:05.454 [main] DEBUG org.apache.spark.util.ClosureCleaner - +++ Cleaning closure <function1> (org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3) +++
12:19:05.468 [main] DEBUG org.apache.spark.util.ClosureCleaner - + declared fields: 3
12:19:05.469 [main] DEBUG org.apache.spark.util.ClosureCleaner - public static final long org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.serialVersionUID
12:19:05.469 [main] DEBUG org.apache.spark.util.ClosureCleaner - private final org.apache.spark.streaming.api.java.JavaDStreamLike org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.$outer
12:19:05.469 [main] DEBUG org.apache.spark.util.ClosureCleaner - private final org.apache.spark.api.java.function.VoidFunction org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.foreachFunc$3
12:19:05.469 [main] DEBUG org.apache.spark.util.ClosureCleaner - + declared methods: 2
12:19:05.470 [main] DEBUG org.apache.spark.util.ClosureCleaner - public final java.lang.Object org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.apply(java.lang.Object)
12:19:05.470 [main] DEBUG org.apache.spark.util.ClosureCleaner - public final void org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3.apply(org.apache.spark.rdd.RDD)
12:19:05.470 [main] DEBUG org.apache.spark.util.ClosureCleaner - + inner classes: 0
12:19:05.471 [main] DEBUG org.apache.spark.util.ClosureCleaner - + outer classes: 1
12:19:05.472 [main] DEBUG org.apache.spark.util.ClosureCleaner - org.apache.spark.streaming.api.java.JavaDStreamLike
12:19:05.472 [main] DEBUG org.apache.spark.util.ClosureCleaner - + outer objects: 1
12:19:05.473 [main] DEBUG org.apache.spark.util.ClosureCleaner - org.apache.spark.streaming.api.java.JavaDStream@7209ffb5
12:19:05.474 [main] DEBUG org.apache.spark.util.ClosureCleaner - + populating accessed fields because this is the starting closure
12:19:05.478 [main] DEBUG org.apache.spark.util.ClosureCleaner - + fields accessed by starting closure: 1
12:19:05.479 [main] DEBUG org.apache.spark.util.ClosureCleaner - (interface org.apache.spark.streaming.api.java.JavaDStreamLike,Set())
12:19:05.479 [main] DEBUG org.apache.spark.util.ClosureCleaner - + outermost object is not a closure, so do not clone it: (interface org.apache.spark.streaming.api.java.JavaDStreamLike,org.apache.spark.streaming.api.java.JavaDStream@7209ffb5)
12:19:05.480 [main] DEBUG org.apache.spark.util.ClosureCleaner - +++ closure <function1> (org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$3) is now cleaned +++
12:19:05.481 [main] DEBUG org.apache.spark.util.ClosureCleaner - +++ Cleaning closure <function2> (org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3) +++
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - + declared fields: 2
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - public static final long org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.serialVersionUID
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - private final scala.Function1 org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.cleanedF$1
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - + declared methods: 2
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - public final java.lang.Object org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(java.lang.Object,java.lang.Object)
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - public final void org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(org.apache.spark.rdd.RDD,org.apache.spark.streaming.Time)
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - + inner classes: 0
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - + outer classes: 0
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - + outer objects: 0
12:19:05.482 [main] DEBUG org.apache.spark.util.ClosureCleaner - + populating accessed fields because this is the starting closure
12:19:05.483 [main] DEBUG org.apache.spark.util.ClosureCleaner - + fields accessed by starting closure: 0
12:19:05.483 [main] DEBUG org.apache.spark.util.ClosureCleaner - + there are no enclosing objects!
12:19:05.483 [main] DEBUG org.apache.spark.util.ClosureCleaner - +++ closure <function2> (org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3) is now cleaned +++

我错过了什么吗?PS:给定的输出就在我的打印列表应该在的地方,我正在为我的工作使用 Spring 单元测试:

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(classes = config.class)
public class myTester extends SharedJavaSparkContext implements Serializable{

最佳答案

我想您需要启动流上下文。

streamingContext.start()

关于JavaDStream 在 lambda 中打印 RDD 到控制台,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/37989007/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com