gpt4 book ai didi

python - 使用 hadoop 流和 mrjob 运行作业 : PipeMapRed. waitOutputThreads(): subprocess failed with code 1

转载 作者:太空狗 更新时间:2023-10-29 19:36:00 25 4
gpt4 key购买 nike

嘿,我是大数据世界的新手。我遇到了这个教程 http://musicmachinery.com/2011/09/04/how-to-process-a-million-songs-in-20-minutes/

它详细描述了如何在本地和 Elastic Map Reduce 上使用 mrjob 运行 MapReduce 作业。

好吧,我正在尝试在我自己的 Hadoop cluser 上运行它。我使用以下命令运行该作业。

python density.py tiny.dat -r hadoop --hadoop-bin /usr/bin/hadoop > outputmusic

这就是我得到的:

HADOOP: Running job: job_1369345811890_0245
HADOOP: Job job_1369345811890_0245 running in uber mode : false
HADOOP: map 0% reduce 0%
HADOOP: Task Id : attempt_1369345811890_0245_m_000000_0, Status : FAILED
HADOOP: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
HADOOP: at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
HADOOP: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
HADOOP: at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
HADOOP: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
HADOOP: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
HADOOP: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
HADOOP: at java.security.AccessController.doPrivileged(Native Method)
HADOOP: at javax.security.auth.Subject.doAs(Subject.java:415)
HADOOP: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
HADOOP: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
HADOOP:
HADOOP: Task Id : attempt_1369345811890_0245_m_000001_0, Status : FAILED
HADOOP: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
HADOOP: at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
HADOOP: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
HADOOP: at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
HADOOP: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
HADOOP: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
HADOOP: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
HADOOP: at java.security.AccessController.doPrivileged(Native Method)
HADOOP: at javax.security.auth.Subject.doAs(Subject.java:415)
HADOOP: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
HADOOP: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
HADOOP:
HADOOP: Task Id : attempt_1369345811890_0245_m_000000_1, Status : FAILED
HADOOP: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
HADOOP: at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
HADOOP: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
HADOOP: at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
HADOOP: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
HADOOP: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
HADOOP: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
HADOOP: at java.security.AccessController.doPrivileged(Native Method)
HADOOP: at javax.security.auth.Subject.doAs(Subject.java:415)
HADOOP: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
HADOOP: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
HADOOP:
HADOOP: Container killed by the ApplicationMaster.
HADOOP:
HADOOP:
HADOOP: Task Id : attempt_1369345811890_0245_m_000001_1, Status : FAILED
HADOOP: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
HADOOP: at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
HADOOP: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
HADOOP: at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
HADOOP: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
HADOOP: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
HADOOP: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
HADOOP: at java.security.AccessController.doPrivileged(Native Method)
HADOOP: at javax.security.auth.Subject.doAs(Subject.java:415)
HADOOP: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
HADOOP: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
HADOOP:
HADOOP: Task Id : attempt_1369345811890_0245_m_000000_2, Status : FAILED
HADOOP: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
HADOOP: at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
HADOOP: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
HADOOP: at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
HADOOP: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
HADOOP: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
HADOOP: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
HADOOP: at java.security.AccessController.doPrivileged(Native Method)
HADOOP: at javax.security.auth.Subject.doAs(Subject.java:415)
HADOOP: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
HADOOP: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
HADOOP:
HADOOP: Task Id : attempt_1369345811890_0245_m_000001_2, Status : FAILED
HADOOP: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
HADOOP: at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
HADOOP: at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
HADOOP: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
HADOOP: at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
HADOOP: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
HADOOP: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
HADOOP: at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
HADOOP: at java.security.AccessController.doPrivileged(Native Method)
HADOOP: at javax.security.auth.Subject.doAs(Subject.java:415)
HADOOP: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
HADOOP: at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
HADOOP:
HADOOP: map 100% reduce 0%
HADOOP: Job job_1369345811890_0245 failed with state FAILED due to: Task failed task_1369345811890_0245_m_000001
HADOOP: Job failed as tasks failed. failedMaps:1 failedReduces:0
HADOOP:
HADOOP: Counters: 6
HADOOP: Job Counters
HADOOP: Failed map tasks=7
HADOOP: Launched map tasks=8
HADOOP: Other local map tasks=6
HADOOP: Data-local map tasks=2
HADOOP: Total time spent by all maps in occupied slots (ms)=32379
HADOOP: Total time spent by all reduces in occupied slots (ms)=0
HADOOP: Job not Successful!
HADOOP: Streaming Command Failed!
STDOUT: packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.0.0-cdh4.2.1.jar] /tmp/streamjob3272348678857116023.jar tmpDir=null
Traceback (most recent call last):
File "density.py", line 34, in <module>
MRDensity.run()
File "/usr/lib/python2.6/site-packages/mrjob-0.2.4-py2.6.egg/mrjob/job.py", line 344, in run
mr_job.run_job()
File "/usr/lib/python2.6/site-packages/mrjob-0.2.4-py2.6.egg/mrjob/job.py", line 381, in run_job
runner.run()
File "/usr/lib/python2.6/site-packages/mrjob-0.2.4-py2.6.egg/mrjob/runner.py", line 316, in run
self._run()
File "/usr/lib/python2.6/site-packages/mrjob-0.2.4-py2.6.egg/mrjob/hadoop.py", line 175, in _run
self._run_job_in_hadoop()
File "/usr/lib/python2.6/site-packages/mrjob-0.2.4-py2.6.egg/mrjob/hadoop.py", line 325, in _run_job_in_hadoop
raise CalledProcessError(step_proc.returncode, streaming_args)
subprocess.CalledProcessError: Command '['/usr/bin/hadoop', 'jar', '/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.2.1.jar', '-cmdenv', 'PYTHONPATH=mrjob.tar.gz', '-input', 'hdfs:///user/E824259/tmp/mrjob/density.E824259.20130611.053850.343441/input', '-output', 'hdfs:///user/E824259/tmp/mrjob/density.E824259.20130611.053850.343441/output', '-cacheFile', 'hdfs:///user/E824259/tmp/mrjob/density.E824259.20130611.053850.343441/files/density.py#density.py', '-cacheArchive', 'hdfs:///user/E824259/tmp/mrjob/density.E824259.20130611.053850.343441/files/mrjob.tar.gz#mrjob.tar.gz', '-mapper', 'python density.py --step-num=0 --mapper --protocol json --output-protocol json --input-protocol raw_value', '-jobconf', 'mapred.reduce.tasks=0']' returned non-zero exit status 1

注意:正如我在其他一些论坛中所建议的那样

#! /usr/bin/python

在我的 python 文件 density.py 和 track.py 的开头。它似乎对大多数人都有效,但我仍然继续得到上述异常(exception)情况。

编辑:我在 density.py 本身的另一个文件 track.py 中定义了原始 density.py 中使用的函数之一的定义。作业成功运行。但如果有人知道为什么会发生这种情况,那将非常有帮助。

最佳答案

错误代码 1 是 Hadoop Streaming 的一般错误。出现此错误代码的主要原因有两个:

  • 您的 Mapper 和 Reducer 脚本不可执行(包括脚本开头的 #!/usr/bin/python)。

  • 您的 Python 程序只是写错了 - 您可能有语法错误或逻辑错误。

不幸的是,错误代码 1 没有提供任何详细信息来帮助您了解您的 Python 程序到底出了什么问题。

我自己有一段时间被错误代码 1 困住了,我想出来的方法是简单地将我的 Mapper 脚本作为一个独立的 python 程序运行:python mapper.py

这样做之后,我收到了一个常规的 Python 错误,告诉我我只是给函数提供了错误类型的参数。我修复了我的语法错误,之后一切正常。因此,如果可能的话,我会将您的 Mapper 或 Reducer 脚本作为独立的 Python 程序运行,以查看是否能让您深入了解错误的原因。

关于python - 使用 hadoop 流和 mrjob 运行作业 : PipeMapRed. waitOutputThreads(): subprocess failed with code 1,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/17037300/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com