I am trying to process some Twitter keywords using a MemChannel and an HDFS sink. However, after the HDFS started status on the console, flume-ng shows no further progress.
Here is the content of the /etc/flume-ng/conf/flume-env.sh file.
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced during Flume startup.
# Environment variables can be set here.
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
# Note that the Flume conf directory is always included in the classpath.
#FLUME_CLASSPATH=""
Here is the content of the Twitter configuration file.
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
#TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.consumerSecret = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.accessToken = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.accessTokenSecret = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://uat.cloudera:8020/user/root/flume/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
I am running the following command on the CentOS console.
flume-ng agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/twitter.conf -n TwitterAgent -Dflume.root.logger=INFO,console
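For reference, here is what the flags mean (my paraphrase of the flume-ng help text, not a quote from the official docs):
# -c /etc/flume-ng/conf              configuration directory, sourced for flume-env.sh
# -f /etc/flume-ng/conf/twitter.conf the agent configuration file to load
# -n TwitterAgent                    agent name; must match the property prefix in the file
# -Dflume.root.logger=INFO,console   log4j override so INFO logs are printed to the console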
Here is the output when I run the command.
Info: Sourcing environment configuration script /etc/flume-ng/conf/flume-env.sh
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12.jar from classpath
+ exec /usr/java/jdk1.7.0_67-cloudera/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp r/lib/flume-ng/../search/lib/xmlbeans-2.3.0.jar:/usr/lib/flume-ng/../search/lib/xmlenc-0.52.jar:/usr/lib/flume-ng/../search/lib/xmpcore-5.1.2.jar:/usr/lib/flume-ng/../search/lib/xz-1.0.jar:/usr/lib/flume-ng/../search/lib/zookeeper.jar' -Djava.library.path=:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native org.apache.flume.node.Application -f /etc/flume-ng/conf/farrukh.conf -n TwitterAgent
2015-09-24 12:05:38,876 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2015-09-24 12:05:38,885 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:/etc/flume-ng/conf/farrukh.conf
2015-09-24 12:05:38,896 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,896 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:931)] Added sinks: HDFS Agent: TwitterAgent
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,898 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,911 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:508)] Agent configuration for 'TwitterAgent' has no sources.
2015-09-24 12:05:38,919 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:141)] Post-validation flume configuration contains configuration for agents: [TwitterAgent]
2015-09-24 12:05:38,920 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:145)] Creating channels
2015-09-24 12:05:38,939 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel MemChannel type memory
2015-09-24 12:05:38,957 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:200)] Created channel MemChannel
2015-09-24 12:05:38,963 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: HDFS, type: hdfs
2015-09-24 12:05:40,019 (conf-file-poller-0) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:559)] Hadoop Security enabled: false
2015-09-24 12:05:40,022 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:114)] Channel MemChannel connected to [HDFS]
2015-09-24 12:05:40,031 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3c1cefaa counterGroup:{ name:null counters:{} } }} channels:{MemChannel=org.apache.flume.channel.MemoryChannel{name: MemChannel}} }
2015-09-24 12:05:40,040 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel MemChannel
2015-09-24 12:05:40,218 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: MemChannel: Successfully registered new MBean.
2015-09-24 12:05:40,218 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: MemChannel started
2015-09-24 12:05:40,219 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink HDFS
2015-09-24 12:05:40,221 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
2015-09-24 12:05:40,221 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SINK, name: HDFS started
Here are the details of my machine environment.
JDK
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Operating system
CentOS release 6.4 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
cat: /etc/lsb-release.d: Is a directory
cpe:/o:centos:linux:6:GA
Flume-ng
Flume 1.5.0-cdh5.3.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: cc2139f386f7fccc9a6e105e2026228af58c6e9f
Compiled by jenkins on Tue Dec 16 20:25:18 PST 2014
From source with checksum 0b02653a07c9e96af03ce2189b8d51c3
Hadoop
Hadoop 2.5.0-cdh5.3.0
Subversion http://github.com/cloudera/hadoop -r f19097cda2536da1df41ff6713556c8f7284174d
Compiled by jenkins on 2014-12-17T03:05Z
Compiled with protoc 2.5.0
From source with checksum 9c4267e6915cf5bbd4c6e08be54d54e0
This command was run using /usr/lib/hadoop/hadoop-common-2.5.0-cdh5.3.0.jar
Here is the output of the HDFS report command.
Configured Capacity: 20506943488 (19.10 GB)
Present Capacity: 20506943488 (19.10 GB)
DFS Remaining: 20057721155 (18.68 GB)
DFS Used: 449222333 (428.41 MB)
DFS Used%: 2.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (uat.cloudera)
Hostname: uat.cloudera
Rack: /default
Decommission Status : Normal
Configured Capacity: 20506943488 (19.10 GB)
DFS Used: 449222333 (428.41 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 20057721155 (18.68 GB)
DFS Used%: 2.19%
DFS Remaining%: 97.81%
Configured Cache Capacity: 4294967296 (4 GB)
Cache Used: 0 (0 B)
Cache Remaining: 4294967296 (4 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 6
Last contact: Thu Sep 25 12:09:42 PDT 2015
Best answer
Your configuration is missing the agent's ".sources" property. How can flume-ng work without knowing its source? This is exactly why your log shows the warning "Agent configuration for 'TwitterAgent' has no sources." You are missing the following line.
TwitterAgent.sources = Twitter
For more details, see the following link: https://flume.apache.org/FlumeUserGuide.html
Always remember that a Flume configuration file has three main parts (sources, channels, sinks). The first three lines declare these three properties.
TwitterAgent.sources = Twitter
TwitterAgent.sinks = HDFS
TwitterAgent.channels = MemChannel
The rest of the configuration file sets the detailed properties of these three components (sources, channels, sinks).
Check the corrected configuration file content below.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
#TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = xxxxx
TwitterAgent.sources.Twitter.consumerSecret = xxxxxx
TwitterAgent.sources.Twitter.accessToken = xxxxx
TwitterAgent.sources.Twitter.accessTokenSecret = xxxxx
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://uat.cloudera:8020/user/root/flume/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10
TwitterAgent.channels.MemChannel.transactionCapacity = 10
Besides adding the sources property, I also changed the following properties so that results show up on HDFS quickly, as temporary files.
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10
TwitterAgent.channels.MemChannel.capacity = 10
TwitterAgent.channels.MemChannel.transactionCapacity = 10
Copy the content and save it as any configuration file, e.g. sample.conf in the /etc/flume-ng/conf/ folder, then run the following command.
flume-ng agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/sample.conf -n TwitterAgent -Dflume.root.logger=INFO,console
After the HDFS started status, it should show processing messages like these.
2015-09-25 13:44:18,045 (lifecycleSupervisor-1-4) [INFO - org.apache.flume.source.twitter.TwitterSource.start(TwitterSource.java:139)] Twitter source Twitter started.
2015-09-25 13:44:18,045 (Twitter Stream consumer-1[initializing]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] Establishing connection.
2015-09-25 13:44:19,931 (Twitter Stream consumer-1[Establishing connection]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] Connection established.
2015-09-25 13:44:19,931 (Twitter Stream consumer-1[Establishing connection]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] Receiving status stream.
2015-09-25 13:44:20,283 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.HDFSDataStream.configure(HDFSDataStream.java:58)] Serializer = TEXT, UseRawLocalFileSystem = false
2015-09-25 13:44:20,557 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284.tmp
2015-09-25 13:44:22,435 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 100 docs
2015-09-25 13:44:25,383 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 200 docs
2015-09-25 13:44:28,178 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 300 docs
2015-09-25 13:44:30,505 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:413)] Closing hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284.tmp
2015-09-25 13:44:30,506 (hdfs-HDFS-call-runner-2) [INFO - org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:339)] Close tries incremented
2015-09-25 13:44:30,526 (hdfs-HDFS-call-runner-3) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:673)] Renaming hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284.tmp to hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284
2015-09-25 13:44:30,607 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285.tmp
2015-09-25 13:44:31,157 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 400 docs
2015-09-25 13:44:33,330 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 500 docs
2015-09-25 13:44:36,131 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 600 docs
2015-09-25 13:44:38,298 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 700 docs
2015-09-25 13:44:40,465 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 800 docs
2015-09-25 13:44:41,158 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:413)] Closing hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285.tmp
2015-09-25 13:44:41,158 (hdfs-HDFS-call-runner-6) [INFO - org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:339)] Close tries incremented
2015-09-25 13:44:41,166 (hdfs-HDFS-call-runner-7) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:673)] Renaming hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285.tmp to hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285
2015-09-25 13:44:41,230 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860286.tmp
2015-09-25 13:44:43,238 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 900 docs
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 1,000 docs
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:300)] Total docs indexed: 1,000, total skipped docs: 0
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:302)] 35 docs/second
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:304)] Run took 28 seconds and processed:
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:306)] 0.009 MB/sec sent to index
2015-09-25 13:44:46,119 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:308)] 0.255 MB text sent to index
2015-09-25 13:44:46,119 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:310)] There were 0 exceptions ignored:
^C2015-09-25 13:44:46,666 (agent-shutdown-hook) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.stop(LifecycleSupervisor.java:79)] Stopping lifecycle supervisor 10
2015-09-25 13:44:46,673 (agent-shutdown-hook) [INFO - org.apache.flume.source.twitter.TwitterSource.stop(TwitterSource.java:150)] Twitter source Twitter stopping...
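Once you see files being renamed from .tmp in the log, you can confirm the data landed using plain HDFS shell commands, for example (the path and file name are taken from the config and log above):
hdfs dfs -ls /user/root/flume/
hdfs dfs -cat /user/root/flume/FlumeData.1443213860284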
Let me know if this solves your problem.
This relates to the Stack Overflow question "flume-ng - Flume not processing keywords from Twitter source using Flume-ng with Hadoop 2.5 cdh5.3": https://stackoverflow.com/questions/32790148/