gpt4 book ai didi

google-cloud-dataflow - 如何修复 Dataflow 无法序列化我的 DoFn?

转载 作者:行者123 更新时间:2023-12-04 04:46:15 27 4
gpt4 key购买 nike

当我运行我的 Dataflow 管道时,我收到以下异常,提示我的 DoFn 无法序列化。我该如何解决?

这是堆栈跟踪:

Caused by: java.lang.IllegalArgumentException: unable to serialize contrail.dataflow.AvroMRTransforms$AvroReducerDoFn@bba0fc2
at com.google.cloud.dataflow.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:51)
at com.google.cloud.dataflow.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:81)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.ensureSerializable(DirectPipelineRunner.java:784)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateHelper(ParDo.java:1025)
at com.google.cloud.dataflow.sdk.transforms.ParDo.evaluateSingleHelper(ParDo.java:963)
at com.google.cloud.dataflow.sdk.transforms.ParDo.access$000(ParDo.java:441)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:951)
at com.google.cloud.dataflow.sdk.transforms.ParDo$1.evaluate(ParDo.java:946)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:611)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:200)
at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:196)
at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:109)
at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:204)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:584)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:328)
at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:70)
at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:145)
at contrail.stages.DataflowStage.stageMain(DataflowStage.java:51)
at contrail.stages.NonMRStage.execute(NonMRStage.java:130)
at contrail.stages.NonMRStage.run(NonMRStage.java:157)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at contrail.stages.ValidateGraphDataflow.main(ValidateGraphDataflow.java:139)
... 6 more
Caused by: java.io.NotSerializableException: org.apache.hadoop.mapred.JobConf
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
at com.google.cloud.dataflow.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:47)
... 27 more

最佳答案

补充一下杰里米所说的......

Serializable 问题的另一个常见原因是在非静态上下文中使用匿名 DoFn。匿名内部类有一个指向封闭类的隐式指针,这也会导致它被序列化。

关于google-cloud-dataflow - 如何修复 Dataflow 无法序列化我的 DoFn?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/28032063/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com