scala - Flink writing to S3 on EMR

Reposted. Author: 可可西里. Updated: 2023-11-01 15:51:52

I am trying to write some output to S3 from Flink running on EMR. I am using Scala 2.11.7, Flink 1.3.2, and EMR 5.11. However, I get the following error:

java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.addResource(Lorg/apache/hadoop/conf/Configuration;)V
at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.initialize(EmrFileSystem.java:93)
at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.initialize(HadoopFileSystem.java:345)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:350)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:389)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:222)
at org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:78)
at org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction.open(OutputFormatSinkFunction.java:61)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:111)
at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:748)

My build.sbt looks like this:

libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-core" % "1.3.2",
  "org.apache.flink" % "flink-scala_2.11" % "1.3.2",
  "org.apache.flink" % "flink-streaming-scala_2.11" % "1.3.2",
  "org.apache.flink" % "flink-shaded-hadoop2" % "1.3.2",
  "org.apache.flink" % "flink-clients_2.11" % "1.3.2",
  "org.apache.flink" %% "flink-avro" % "1.3.2",
  "org.apache.flink" %% "flink-connector-filesystem" % "1.3.2"
)

I also found this post, but it did not solve the problem: External checkpoints to S3 on EMR

I am simply writing the output to S3: input.writeAsText("s3://test/flink"). Any suggestions would be appreciated.
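For context, a job of the kind described might look roughly like the sketch below. Only the `writeAsText("s3://test/flink")` call comes from the question; the object name, the sample elements, and the job name are hypothetical placeholders, and the real job presumably reads from some other source.

```scala
import org.apache.flink.streaming.api.scala._

// Hypothetical minimal job; only the writeAsText call is taken from the question.
object WriteToS3Sketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Placeholder input; the actual source is not shown in the question.
    val input: DataStream[String] = env.fromElements("a", "b", "c")

    // The sink opens the S3 file system when the task starts, which is
    // where the stack trace above originates (FileOutputFormat.open).
    input.writeAsText("s3://test/flink")

    env.execute("write-to-s3-sketch")
  }
}
```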

Best answer

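I am not sure which combinations of flink-shaded-hadoop and EMR versions work well together. After a few rounds of trial and error, I was able to write to S3 by using a newer version of flink-shaded-hadoop2: "org.apache.flink" % "flink-shaded-hadoop2" % "1.4.0"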
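A `NoSuchMethodError` generally means that the class loaded at runtime comes from a different library version than the one the calling code was compiled against, which fits a fix that changes only a dependency version. Applied to the build.sbt from the question, the change could look like the sketch below. Leaving the other artifacts at 1.3.2 is an assumption; the answer does not say whether they were upgraded as well.

```scala
// build.sbt (sketch): only flink-shaded-hadoop2 is bumped, as the answer describes.
// Assumption: all other Flink artifacts stay at 1.3.2, as in the question.
libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-core" % "1.3.2",
  "org.apache.flink" % "flink-scala_2.11" % "1.3.2",
  "org.apache.flink" % "flink-streaming-scala_2.11" % "1.3.2",
  "org.apache.flink" % "flink-shaded-hadoop2" % "1.4.0", // was 1.3.2
  "org.apache.flink" % "flink-clients_2.11" % "1.3.2",
  "org.apache.flink" %% "flink-avro" % "1.3.2",
  "org.apache.flink" %% "flink-connector-filesystem" % "1.3.2"
)
```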

A similar question about scala - Flink writing to S3 on EMR can be found on Stack Overflow: https://stackoverflow.com/questions/48388074/
