
java - hadoop mapreduce teragen FAIL_CONTAINER_CLEANUP

Reposted. Author: 可可西里. Updated: 2023-11-01 15:57:21

My Hadoop cluster is having some problems. I tried to run a benchmark on it to check its performance and to see whether MapReduce works correctly, but I am getting some strange behaviour. MapReduce actually starts and runs its map phase, but I get some errors along the way. First I use teragen to generate the data:

$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teragen 500 random-data
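For context, teragen takes the number of 100-byte rows to generate and an HDFS output directory, so the command above writes only about 50 KB. A minimal sketch of the full TeraSort benchmark sequence these examples are usually run as (the `sorted-data` and `validate-report` directory names are illustrative, not from the original post):

```shell
# Generate 500 rows of 100 bytes each (~50 KB) into HDFS
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teragen 500 random-data

# Sort the generated data
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort random-data sorted-data

# Verify that the sorted output is globally ordered
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teravalidate sorted-data validate-report
```

These commands need a live cluster with the examples jar installed, so they are shown here as a reference sequence rather than a standalone script.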

The job then starts, and I get some failures without the process stopping:

17/02/23 12:29:27 INFO client.RMProxy: Connecting to ResourceManager at /172.16.138.145:8032

17/02/23 12:29:28 INFO terasort.TeraSort: Generating 500 using 2

17/02/23 12:29:28 INFO mapreduce.JobSubmitter: number of splits:2

17/02/23 12:29:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1487846108320_0007

17/02/23 12:29:28 INFO impl.YarnClientImpl: Submitted application application_1487846108320_0007

17/02/23 12:29:28 INFO mapreduce.Job: The url to track the job: http://172.16.138.145:8088/proxy/application_1487846108320_0007/

17/02/23 12:29:28 INFO mapreduce.Job: Running job: job_1487846108320_0007

17/02/23 12:29:34 INFO mapreduce.Job: Job job_1487846108320_0007 running in uber mode : false

17/02/23 12:29:34 INFO mapreduce.Job: map 0% reduce 0%

17/02/23 12:29:47 INFO mapreduce.Job: Task Id : attempt_1487846108320_0007_m_000001_0, Status : FAILED

17/02/23 12:29:48 INFO mapreduce.Job: Task Id : attempt_1487846108320_0007_m_000000_0, Status : FAILED

17/02/23 12:30:02 INFO mapreduce.Job: map 50% reduce 0%

17/02/23 12:30:02 INFO mapreduce.Job: Task Id : attempt_1487846108320_0007_m_000001_1, Status : FAILED

17/02/23 12:30:03 INFO mapreduce.Job: map 0% reduce 0%

17/02/23 12:30:03 INFO mapreduce.Job: Task Id : attempt_1487846108320_0007_m_000000_1, Status : FAILED

17/02/23 12:30:15 INFO mapreduce.Job: Task Id : attempt_1487846108320_0007_m_000001_2, Status : FAILED

17/02/23 12:30:16 INFO mapreduce.Job: Task Id : attempt_1487846108320_0007_m_000000_2, Status : FAILED

17/02/23 12:30:30 INFO mapreduce.Job: map 100% reduce 0%

17/02/23 12:30:31 INFO mapreduce.Job: Job job_1487846108320_0007 failed with state FAILED due to: Task failed task_1487846108320_0007_m_000001

Job failed as tasks failed. failedMaps:1 failedReduces:0

I checked the logs on the datanode involved and found the following lines repeated for each failure:

2017-02-23 11:36:12,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1487846108320_0001_m_000001_1 TaskAttempt Transitioned from RUNNING to FAIL_CONTAINER_CLEANUP

2017-02-23 11:36:12,901 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1487846108320_0001_m_000001_1:

2017-02-23 11:36:12,902 INFO [ContainerLauncher #5] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1487846108320_0001_01_000004 taskAttempt attempt_1487846108320_0001_m_000001_1

2017-02-23 11:36:12,903 INFO [ContainerLauncher #5] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1487846108320_0001_m_000001_1

2017-02-23 11:36:12,903 INFO [ContainerLauncher #5] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : Datanode3:34121

2017-02-23 11:36:12,923 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1487846108320_0001_m_000001_1 TaskAttempt Transitioned from FAIL_CONTAINER_CLEANUP to FAIL_TASK_CLEANUP

2017-02-23 11:36:12,924 INFO [CommitterEvent Processor #2] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: TASK_ABORT

2017-02-23 11:36:12,932 WARN [CommitterEvent Processor #2] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://172.16.138.145:9000/user/hdfs/random-dataSmallV7.7/_temporary/1/_temporary/attempt_1487846108320_0001_m_000001_1

2017-02-23 11:36:12,932 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1487846108320_0001_m_000001_1 TaskAttempt Transitioned from FAIL_TASK_CLEANUP to FAILED

In this case the job failed, but sometimes I get the errors and the job still succeeds (rarely). Do you know what causes this FAIL_CONTAINER_CLEANUP, or what the underlying cause of the problem might be? Only mappers are used here and no reducers are requested, but in other cases that do involve reducers the same errors appear.
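As a general debugging step (not part of the original post), the application master log shown above only records the cleanup transition; the real reason an attempt was killed usually appears in the container's own stderr. Assuming YARN log aggregation is enabled, the aggregated logs can be pulled with:

```shell
# Fetch the aggregated logs for the failed application
# (application id taken from the job output above)
yarn logs -applicationId application_1487846108320_0007

# Or inspect one failed container directly, using a container id
# from the AM log (this one is illustrative)
yarn logs -applicationId application_1487846108320_0007 -containerId container_1487846108320_0007_01_000002
```

If aggregation is disabled, the same information lives under the NodeManager's local log directory on the node that ran the container.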

Thanks in advance for your ideas.

Best Answer

I finally solved it. My /etc/hosts file contained a line referencing my node: 127.0.1.1 Datanode1

I replaced that line with my machine's FQDN: 172.16.138.147 Datanode1

This allows Hadoop to resolve the reference to my server correctly and fixes the error.
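The fix, sketched as a before/after of /etc/hosts (the IP and hostname are taken from the answer above; a 127.0.1.1 loopback entry for the node's own hostname makes the daemon advertise an address that other nodes cannot reach):

```
# Before (broken): the node's hostname resolves to loopback
127.0.1.1       Datanode1

# After (fixed): the hostname resolves to the machine's real address
172.16.138.147  Datanode1
```

Each worker node needs the same treatment, so its hostname maps to an address reachable from the rest of the cluster.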

I hope this helps someone else.

About java - hadoop mapreduce teragen FAIL_CONTAINER_CLEANUP: we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42416921/
