gpt4 book ai didi

hadoop - 使用 Flume 和 Hadoop 的数据摄取不起作用

转载 作者:可可西里 更新时间:2023-11-01 15:39:01 26 4
gpt4 key购买 nike

我使用的是 Flume 1.4.0 和 Hadoop 2.2.0。当我启动 Flume 并写入 HDFS 时,出现以下异常:

(SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:460)] process failed
java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2442)
at java.lang.Class.privateGetPublicMethods(Class.java:2562)
at java.lang.Class.privateGetPublicMethods(Class.java:2572)
at java.lang.Class.getMethods(Class.java:1427)
at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
at java.lang.reflect.Proxy.getProxyClass(Proxy.java:521)
at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:601)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:328)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:235)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:207)
at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:53)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:172)
at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:170)
at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:364)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

我的 hdfs-sink 在 flume.conf 中的部分看起来像这样:

定义一个sink输出到hdfs

agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/flume
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
agent.sinks.hdfs-sink.hdfs.rollCount = 10
agent.sinks.hdfs-sink.hdfs.batchSize = 10
agent.sinks.hdfs-sink.hdfs.rollSize = 0

我希望有人能帮助我。

最佳答案

如果有人遇到同样的问题,这里是解决方案:

通过从 hadoop 复制新的 jar 来替换 flume/lib 目录中所有旧的 jar。

关于hadoop - 使用 Flume 和 Hadoop 的数据摄取不起作用,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/19997375/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com