- android - RelativeLayout 背景可绘制重叠内容
- android - 如何链接 cpufeatures lib 以获取 native android 库?
- java - OnItemClickListener 不起作用,但 OnLongItemClickListener 在自定义 ListView 中起作用
- java - Android 文件转字符串
我已经在一个双节点集群上安装了 hadoop。第一个节点“namenode”运行以下守护进程:
hadoop@namenode:~$ jps
2916 SecondaryNameNode
2692 NameNode
3159 NodeManager
5834 Jps
2771 DataNode
3076 ResourceManager
秒节点“datanode”运行以下守护进程:
hadoop@datanode:~$ jps
2559 Jps
2087 DataNode
2198 NodeManager
在我在两台机器上添加的 /etc/hosts
文件中:
10.240.40.246 namenode
10.240.172.201 datanode
哪些是相应的 ip,我检查我是否可以从每台机器通过 ssh 连接到任何其他机器。现在,我想通过执行示例 map reduce 基准测试作业来测试我的 hadoop 安装:
hadoop@namenode:~$ hadoop jar /opt/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -write -nrFiles 20 -fileSize 10
但是这个工作失败了:
14/02/17 22:22:53 INFO fs.TestDFSIO: TestDFSIO.1.7
14/02/17 22:22:53 INFO fs.TestDFSIO: nrFiles = 20
14/02/17 22:22:53 INFO fs.TestDFSIO: nrBytes (MB) = 10.0
14/02/17 22:22:53 INFO fs.TestDFSIO: bufferSize = 1000000
14/02/17 22:22:53 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
14/02/17 22:22:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/17 22:22:55 INFO fs.TestDFSIO: creating control file: 10485760 bytes, 20 files
14/02/17 22:22:56 INFO fs.TestDFSIO: created control files for: 20 files
14/02/17 22:22:56 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/02/17 22:22:56 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/02/17 22:22:57 INFO mapred.FileInputFormat: Total input paths to process : 20
14/02/17 22:22:57 INFO mapreduce.JobSubmitter: number of splits:20
14/02/17 22:22:57 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/02/17 22:22:57 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/02/17 22:22:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392675199090_0001
14/02/17 22:22:59 INFO impl.YarnClientImpl: Submitted application application_1392675199090_0001 to ResourceManager at /0.0.0.0:8032
14/02/17 22:22:59 INFO mapreduce.Job: The url to track the job: http://namenode.c.forward-camera-473.internal:8088/proxy/application_1392675199090_0001/
14/02/17 22:22:59 INFO mapreduce.Job: Running job: job_1392675199090_0001
14/02/17 22:23:10 INFO mapreduce.Job: Job job_1392675199090_0001 running in uber mode : false
14/02/17 22:23:10 INFO mapreduce.Job: map 0% reduce 0%
14/02/17 22:23:42 INFO mapreduce.Job: map 20% reduce 0%
14/02/17 22:23:43 INFO mapreduce.Job: map 30% reduce 0%
14/02/17 22:24:14 INFO mapreduce.Job: map 60% reduce 0%
14/02/17 22:24:41 INFO mapreduce.Job: map 60% reduce 20%
14/02/17 22:24:45 INFO mapreduce.Job: map 85% reduce 20%
14/02/17 22:24:48 INFO mapreduce.Job: map 85% reduce 28%
14/02/17 22:24:59 INFO mapreduce.Job: map 90% reduce 28%
14/02/17 22:25:00 INFO mapreduce.Job: map 90% reduce 30%
14/02/17 22:25:02 INFO mapreduce.Job: map 100% reduce 30%
14/02/17 22:25:03 INFO mapreduce.Job: map 100% reduce 100%
14/02/17 22:25:16 INFO mapreduce.Job: map 0% reduce 0%
14/02/17 22:25:16 INFO mapreduce.Job: Job job_1392675199090_0001 failed with state FAILED due to: Application application_1392675199090_0001 failed 2 times due to AM Container for appattempt_1392675199090_0001_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
.Failing this attempt.. Failing the application.
14/02/17 22:25:16 INFO mapreduce.Job: Counters: 0
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
at org.apache.hadoop.fs.TestDFSIO.runIOTest(TestDFSIO.java:443)
at org.apache.hadoop.fs.TestDFSIO.writeTest(TestDFSIO.java:425)
at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:755)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:650)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:115)
at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
查看我在机器 datanode
上找到的日志文件:
hadoop@datanode:/opt/hadoop-2.2.0/logs$ cat yarn-hadoop-nodemanager-datanode.log
...
2014-02-17 22:29:33,432 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
在我的名字节点上我做了:
hadoop@namenode:/opt/hadoop-2.2.0/logs$ cat yarn-hadoop-*log
2014-02-17 22:13:20,833 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: STARTUP_MSG:
...
2014-02-17 22:13:25,240 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: The Auxilurary Service named 'mapreduce_shuffle' in the configuration is for class class org.apache.hadoop.mapred.ShuffleHandler which has a name of 'httpshuffle'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
...
2014-02-17 22:13:25,505 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: NodeManager configured with 8 G physical memory allocated to containers, which is more than 80% of the total physical memory available (3.6 G). Thrashing might happen.
...
2014-02-17 22:24:48,779 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1392675199090_0001_01_000023
2014-02-17 22:24:48,779 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1392675199090_0001_01_000024
...
2014-02-17 22:25:15,733 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1392675199090_0001_02_000001 is : 1
2014-02-17 22:25:15,734 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1392675199090_0001_02_000001 and exit code: 1
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
...
2014-02-17 22:25:15,736 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1
...
2014-02-17 22:25:15,751 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1392675199090_0001 CONTAINERID=container_1392675199090_0001_02_000001
...
2014-02-17 22:13:19,150 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: STARTUP_MSG:
...
2014-02-17 22:25:15,837 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1392675199090_0001 failed 2 times due to AM Container for appattempt_1392675199090_0001_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
.Failing this attempt.. Failing the application. APPID=application_1392675199090_0001
但是,我在机器 namenode
上检查端口 8031 正在监听。我得到:
hadoop@namenode:~$ netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 namenode.c.forwar:36975 metadata.google.in:http TIME_WAIT
tcp 0 0 namenode.c.forwar:36969 metadata.google.in:http TIME_WAIT
tcp 0 0 namenode.c.forwar:40616 namenode.c.forwar:10001 TIME_WAIT
tcp 0 0 namenode.c.forwar:36974 metadata.google.in:http ESTABLISHED
tcp 0 0 namenode.c.forward:8031 namenode.c.forwar:41229 ESTABLISHED
tcp 0 352 namenode.c.forward-:ssh e178064245.adsl.a:64305 ESTABLISHED
tcp 0 0 namenode.c.forwar:41229 namenode.c.forward:8031 ESTABLISHED
tcp 0 0 namenode.c.forwar:40365 namenode.c.forwar:10001 ESTABLISHED
tcp 0 0 namenode.c.forwar:10001 namenode.c.forwar:40365 ESTABLISHED
tcp 0 0 namenode.c.forwar:10001 datanode:48786 ESTABLISHED
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags Type State I-Node Path
unix 10 [ ] DGRAM 4604 /dev/log
unix 2 [ ] STREAM CONNECTED 10490
unix 2 [ ] STREAM CONNECTED 10488
unix 2 [ ] STREAM CONNECTED 10452
unix 2 [ ] STREAM CONNECTED 8452
unix 2 [ ] STREAM CONNECTED 7800
unix 2 [ ] STREAM CONNECTED 7797
unix 2 [ ] STREAM CONNECTED 6762
unix 2 [ ] STREAM CONNECTED 6702
unix 2 [ ] STREAM CONNECTED 6698
unix 2 [ ] STREAM CONNECTED 6208
unix 2 [ ] DGRAM 5750
unix 2 [ ] DGRAM 5737
unix 2 [ ] DGRAM 5734
unix 3 [ ] STREAM CONNECTED 5643
unix 3 [ ] STREAM CONNECTED 5642
unix 2 [ ] DGRAM 5640
unix 2 [ ] DGRAM 5192
unix 2 [ ] DGRAM 5171
unix 2 [ ] DGRAM 4889
unix 2 [ ] DGRAM 4723
unix 2 [ ] DGRAM 4663
unix 3 [ ] DGRAM 3132
unix 3 [ ] DGRAM 3131
所以,这里可能是什么问题。在我看来,一切都设置得很好。为什么我的工作失败了?
最佳答案
datanode
上的日志说
Retrying connect to server: 0.0.0.0/0.0.0.0:8031
所以它会尝试连接到本地机器上的这个端口,也就是datanode
。但是,该服务在 namenode
上运行。因此,必须将以下配置行添加到 yarn-site.xml
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>namenode:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>namenode:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>namenode:8030</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>namenode:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>namenode:8088</value>
</property>
其中 namenode
是 /etc/hosts
中运行资源管理器守护进程的机器的别名。
关于exception - 简单的 YARN 基准测试 TestDFSIO 失败,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/21840771/
我正在使用 apache 提供的基准文件 TestDFSIO 测试我的 hadoop 配置。我正在根据本教程(资源 1)运行它: http://www.michael-noll.com/blog/20
我设置了一个双节点 hadoop 集群。启动集群后,它看起来像这样: 机器namenode: hadoop@namenode:~$ jps 5691 Jps 3531 DataNode 3424 Na
是否可以以非 hdfs 用户身份执行 TestDFSIO 基准测试?此基准测试试图创建一个/benchmarks 目录,但由于缺少权限而失败。有没有办法让这个基准测试使用我的 hdfs home 来存
我已经在一个双节点集群上安装了 hadoop。第一个节点“namenode”运行以下守护进程: hadoop@namenode:~$ jps 2916 SecondaryNameNode 2692 N
我有一个包含 11 个节点的集群,其中 9 个是从节点,2 个是主节点,与 my previous question 中的相同.我正在这个使用 CDH 5.8.0 的集群上执行 TestDFSIO 基
环境详情: 操作系统:CentOS 7.2CDH:CDH 5.8.0主机:11(2个master,4个DN+NM,5个NM) yarn.nodemanager.resource.memory-mb 3
我是 hadoop 的新手。我想对 hadoop 集群进行压力/性能测试。为此,我按照 Hadoop benchmarking 给出的说明进行操作。 .不同之处在于,在教程中他谈论的是 hadoop
我是一名优秀的程序员,十分优秀!