I am trying to process some Twitter keywords using a MemChannel and an HDFS sink. However, after the HDFS started status on the console, flume-ng shows no further progress.
Here is the content of the /etc/flume-ng/conf/flume-env.sh file.
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced during Flume startup.
# Environment variables can be set here.
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
# Note that the Flume conf directory is always included in the classpath.
#FLUME_CLASSPATH=""
Here is the content of the Twitter configuration file.
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
#TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.consumerSecret = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.accessToken = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.accessTokenSecret = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://uat.cloudera:8020/user/root/flume/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
I am running the following command on the CentOS console.
flume-ng agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/twitter.conf -n TwitterAgent -Dflume.root.logger=INFO,console
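For reference, here is what the flags mean (my paraphrase of the flume-ng help text, not a quote from the official docs):
# -c /etc/flume-ng/conf              configuration directory, sourced for flume-env.sh
# -f /etc/flume-ng/conf/twitter.conf the agent configuration file to load
# -n TwitterAgent                    agent name; must match the property prefix in the file
# -Dflume.root.logger=INFO,console   log4j override so INFO logs are printed to the console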
Here is the output when I run the command.
Info: Sourcing environment configuration script /etc/flume-ng/conf/flume-env.sh
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12.jar from classpath
+ exec /usr/java/jdk1.7.0_67-cloudera/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp r/lib/flume-ng/../search/lib/xmlbeans-2.3.0.jar:/usr/lib/flume-ng/../search/lib/xmlenc-0.52.jar:/usr/lib/flume-ng/../search/lib/xmpcore-5.1.2.jar:/usr/lib/flume-ng/../search/lib/xz-1.0.jar:/usr/lib/flume-ng/../search/lib/zookeeper.jar' -Djava.library.path=:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native org.apache.flume.node.Application -f /etc/flume-ng/conf/farrukh.conf -n TwitterAgent
2015-09-24 12:05:38,876 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2015-09-24 12:05:38,885 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:/etc/flume-ng/conf/farrukh.conf
2015-09-24 12:05:38,896 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,896 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:931)] Added sinks: HDFS Agent: TwitterAgent
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,897 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,898 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:HDFS
2015-09-24 12:05:38,911 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:508)] Agent configuration for 'TwitterAgent' has no sources.
2015-09-24 12:05:38,919 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:141)] Post-validation flume configuration contains configuration for agents: [TwitterAgent]
2015-09-24 12:05:38,920 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:145)] Creating channels
2015-09-24 12:05:38,939 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel MemChannel type memory
2015-09-24 12:05:38,957 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:200)] Created channel MemChannel
2015-09-24 12:05:38,963 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: HDFS, type: hdfs
2015-09-24 12:05:40,019 (conf-file-poller-0) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:559)] Hadoop Security enabled: false
2015-09-24 12:05:40,022 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:114)] Channel MemChannel connected to [HDFS]
2015-09-24 12:05:40,031 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3c1cefaa counterGroup:{ name:null counters:{} } }} channels:{MemChannel=org.apache.flume.channel.MemoryChannel{name: MemChannel}} }
2015-09-24 12:05:40,040 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel MemChannel
2015-09-24 12:05:40,218 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: MemChannel: Successfully registered new MBean.
2015-09-24 12:05:40,218 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: MemChannel started
2015-09-24 12:05:40,219 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink HDFS
2015-09-24 12:05:40,221 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
2015-09-24 12:05:40,221 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SINK, name: HDFS started
Here are the details of my machine environment.
JDK
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Operating system
CentOS release 6.4 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
cat: /etc/lsb-release.d: Is a directory
cpe:/o:centos:linux:6:GA
Flume-ng
Flume 1.5.0-cdh5.3.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: cc2139f386f7fccc9a6e105e2026228af58c6e9f
Compiled by jenkins on Tue Dec 16 20:25:18 PST 2014
From source with checksum 0b02653a07c9e96af03ce2189b8d51c3
Hadoop
Hadoop 2.5.0-cdh5.3.0
Subversion http://github.com/cloudera/hadoop -r f19097cda2536da1df41ff6713556c8f7284174d
Compiled by jenkins on 2014-12-17T03:05Z
Compiled with protoc 2.5.0
From source with checksum 9c4267e6915cf5bbd4c6e08be54d54e0
This command was run using /usr/lib/hadoop/hadoop-common-2.5.0-cdh5.3.0.jar
Here is the output of the HDFS report command.
Configured Capacity: 20506943488 (19.10 GB)
Present Capacity: 20506943488 (19.10 GB)
DFS Remaining: 20057721155 (18.68 GB)
DFS Used: 449222333 (428.41 MB)
DFS Used%: 2.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (uat.cloudera)
Hostname: uat.cloudera
Rack: /default
Decommission Status : Normal
Configured Capacity: 20506943488 (19.10 GB)
DFS Used: 449222333 (428.41 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 20057721155 (18.68 GB)
DFS Used%: 2.19%
DFS Remaining%: 97.81%
Configured Cache Capacity: 4294967296 (4 GB)
Cache Used: 0 (0 B)
Cache Remaining: 4294967296 (4 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 6
Last contact: Thu Sep 25 12:09:42 PDT 2015
Best answer
Your configuration is missing the agent's ".sources" property. How can flume-ng work without knowing its source? This is exactly why your log shows the warning "Agent configuration for 'TwitterAgent' has no sources." You are missing the following line.
TwitterAgent.sources = Twitter
For more details, see the following link: https://flume.apache.org/FlumeUserGuide.html
Always remember that a Flume configuration file has three main parts (sources, channels, sinks). The first three lines declare these three properties.
TwitterAgent.sources = Twitter
TwitterAgent.sinks = HDFS
TwitterAgent.channels = MemChannel
The rest of the configuration file sets the detailed properties of these three components (sources, channels, sinks).
Check the corrected configuration file content below.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
#TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = xxxxx
TwitterAgent.sources.Twitter.consumerSecret = xxxxxx
TwitterAgent.sources.Twitter.accessToken = xxxxx
TwitterAgent.sources.Twitter.accessTokenSecret = xxxxx
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://uat.cloudera:8020/user/root/flume/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10
TwitterAgent.channels.MemChannel.transactionCapacity = 10
Besides adding the sources property, I also changed the following properties so that results show up on HDFS quickly, as temporary files.
TwitterAgent.sinks.HDFS.hdfs.batchSize = 10
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10
TwitterAgent.channels.MemChannel.capacity = 10
TwitterAgent.channels.MemChannel.transactionCapacity = 10
Copy the content and save it as any configuration file, e.g. sample.conf in the /etc/flume-ng/conf/ folder, then run the following command.
flume-ng agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/sample.conf -n TwitterAgent -Dflume.root.logger=INFO,console
After the HDFS started status, it should show processing messages like these.
2015-09-25 13:44:18,045 (lifecycleSupervisor-1-4) [INFO - org.apache.flume.source.twitter.TwitterSource.start(TwitterSource.java:139)] Twitter source Twitter started.
2015-09-25 13:44:18,045 (Twitter Stream consumer-1[initializing]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] Establishing connection.
2015-09-25 13:44:19,931 (Twitter Stream consumer-1[Establishing connection]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] Connection established.
2015-09-25 13:44:19,931 (Twitter Stream consumer-1[Establishing connection]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] Receiving status stream.
2015-09-25 13:44:20,283 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.HDFSDataStream.configure(HDFSDataStream.java:58)] Serializer = TEXT, UseRawLocalFileSystem = false
2015-09-25 13:44:20,557 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284.tmp
2015-09-25 13:44:22,435 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 100 docs
2015-09-25 13:44:25,383 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 200 docs
2015-09-25 13:44:28,178 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 300 docs
2015-09-25 13:44:30,505 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:413)] Closing hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284.tmp
2015-09-25 13:44:30,506 (hdfs-HDFS-call-runner-2) [INFO - org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:339)] Close tries incremented
2015-09-25 13:44:30,526 (hdfs-HDFS-call-runner-3) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:673)] Renaming hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284.tmp to hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860284
2015-09-25 13:44:30,607 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285.tmp
2015-09-25 13:44:31,157 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 400 docs
2015-09-25 13:44:33,330 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 500 docs
2015-09-25 13:44:36,131 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 600 docs
2015-09-25 13:44:38,298 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 700 docs
2015-09-25 13:44:40,465 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 800 docs
2015-09-25 13:44:41,158 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:413)] Closing hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285.tmp
2015-09-25 13:44:41,158 (hdfs-HDFS-call-runner-6) [INFO - org.apache.flume.sink.hdfs.BucketWriter$3.call(BucketWriter.java:339)] Close tries incremented
2015-09-25 13:44:41,166 (hdfs-HDFS-call-runner-7) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:673)] Renaming hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285.tmp to hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860285
2015-09-25 13:44:41,230 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://uat.cloudera:8020/user/root/flume/FlumeData.1443213860286.tmp
2015-09-25 13:44:43,238 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 900 docs
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.onStatus(TwitterSource.java:178)] Processed 1,000 docs
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:300)] Total docs indexed: 1,000, total skipped docs: 0
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:302)] 35 docs/second
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:304)] Run took 28 seconds and processed:
2015-09-25 13:44:46,118 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:306)] 0.009 MB/sec sent to index
2015-09-25 13:44:46,119 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:308)] 0.255 MB text sent to index
2015-09-25 13:44:46,119 (Twitter4J Async Dispatcher[0]) [INFO - org.apache.flume.source.twitter.TwitterSource.logStats(TwitterSource.java:310)] There were 0 exceptions ignored:
^C2015-09-25 13:44:46,666 (agent-shutdown-hook) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.stop(LifecycleSupervisor.java:79)] Stopping lifecycle supervisor 10
2015-09-25 13:44:46,673 (agent-shutdown-hook) [INFO - org.apache.flume.source.twitter.TwitterSource.stop(TwitterSource.java:150)] Twitter source Twitter stopping...
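Once you see files being renamed from .tmp in the log, you can confirm the data landed using plain HDFS shell commands, for example (the path and file name are taken from the config and log above):
hdfs dfs -ls /user/root/flume/
hdfs dfs -cat /user/root/flume/FlumeData.1443213860284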
Let me know if this solves your problem.
This relates to the Stack Overflow question "flume-ng - Flume not processing keywords from Twitter source using Flume-ng with Hadoop 2.5 cdh5.3": https://stackoverflow.com/questions/32790148/