
scala - org.apache.spark.SparkException: Task not serializable


Here is a working code example:

JavaPairDStream<String, String> messages = KafkaUtils.createStream(javaStreamingContext, zkQuorum, group, topicMap);
messages.print();

JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
    @Override
    public String call(Tuple2<String, String> tuple2) {
        return tuple2._2();
    }
});

I am getting the following error:
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1435)
at org.apache.spark.streaming.dstream.DStream.map(DStream.scala:438)
at org.apache.spark.streaming.api.java.JavaDStreamLike$class.map(JavaDStreamLike.scala:140)
at org.apache.spark.streaming.api.java.JavaPairDStream.map(JavaPairDStream.scala:46)

Best Answer

Since you define your map function with an anonymous inner class, the containing class must also be serializable. Define your map function as a separate class or make it a static nested class. From the Java documentation (http://docs.oracle.com/javase/8/docs/platform/serialization/spec/serial-arch.html):

Note - Serialization of inner classes (i.e., nested classes that are not static member classes), including local and anonymous classes, is strongly discouraged for several reasons. Because inner classes declared in non-static contexts contain implicit non-transient references to enclosing class instances, serializing such an inner class instance will result in serialization of its associated outer class instance as well.
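Below is a minimal sketch of the suggested fix, assuming a hypothetical enclosing class named StreamProcessor and a hypothetical function class named ExtractValue: the map function moves into a static nested class, which carries no implicit reference to the outer instance, so only the function object itself needs to be serializable (Spark's Function interface already extends java.io.Serializable).

import org.apache.spark.api.java.function.Function;
import scala.Tuple2;

public class StreamProcessor {

    // Hypothetical names for illustration. A static nested class holds no
    // implicit reference to the enclosing instance, so serializing this
    // function does not drag the outer class into the closure.
    static class ExtractValue implements Function<Tuple2<String, String>, String> {
        @Override
        public String call(Tuple2<String, String> tuple2) {
            return tuple2._2();
        }
    }
}

The map call then becomes messages.map(new StreamProcessor.ExtractValue()); and the ClosureCleaner no longer attempts to serialize the enclosing class.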

Regarding scala - org.apache.spark.SparkException: Task not serializable, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29295838/
