r - Hadoop Streaming job running an R script fails: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

Reposted  Author: 行者123  Updated: 2023-12-02 21:59:51

I have an R script that works fine in the R console, but when I run it through Hadoop Streaming it fails in the map phase with the error below.

This is my Hadoop Streaming command:

/home/Bibhu/hadoop-0.20.2/bin/hadoop jar \
/home/Bibhu/hadoop-0.20.2/contrib/streaming/*.jar \
-input hdfs://localhost:54310/user/Bibhu/BookTE1.csv \
-output outsid -mapper `pwd`/code1.sh

stderr log
Loading required package: class
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
Calls: read.csv -> read.table
Execution halted
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)

syslog
2013-07-03 19:32:36,080 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-07-03 19:32:36,654 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2013-07-03 19:32:36,675 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2013-07-03 19:32:36,835 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2013-07-03 19:32:36,835 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2013-07-03 19:32:36,899 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/home/Bibhu/Downloads/SentimentAnalysis/Sid/smallFile/code1.sh]
2013-07-03 19:32:37,256 INFO org.apache.hadoop.streaming.PipeMapRed: Records R/W=0/1
2013-07-03 19:32:38,509 INFO org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
2013-07-03 19:32:38,509 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed failed!
2013-07-03 19:32:38,557 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2013-07-03 19:32:38,631 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task

Best answer

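  • Give the full file name of the Hadoop Streaming jar, version included (for example hadoop-streaming-1.0.4.jar).
  • Use the -file option to specify the file paths of the mapper and the reducer separately.
  • Use the -mapper and -reducer options to tell Hadoop which files are your mapper and reducer code.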
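
Put together, the points above might translate into a command along the following lines. This is only a sketch: the helper build_streaming_cmd, the script names mapper.R and reducer.R, and the STREAMING_JAR_DIR variable are assumptions for illustration and do not come from the question. The function prints the command rather than running it, so it can be reviewed and adjusted first.

```shell
#!/bin/sh
# Sketch only: prints a Hadoop Streaming command instead of running it.
# build_streaming_cmd, mapper.R and reducer.R are hypothetical names.

build_streaming_cmd() {
    hadoop_bin=$1   # path to the hadoop launcher, e.g. .../bin/hadoop
    jar=$2          # full streaming jar file name, version included
    input=$3        # HDFS input path
    output=$4       # HDFS output directory

    # -file ships a local script to the task nodes; -mapper/-reducer then
    # refer to it by its bare file name in the task's working directory.
    echo "$hadoop_bin jar $jar" \
         "-file mapper.R -mapper mapper.R" \
         "-file reducer.R -reducer reducer.R" \
         "-input $input -output $output"
}

# Paths reused from the question; the jar directory is left as a variable
# because the actual jar location depends on the installation.
STREAMING_JAR_DIR=${STREAMING_JAR_DIR:-/path/to/streaming}
build_streaming_cmd \
    /home/Bibhu/hadoop-0.20.2/bin/hadoop \
    "$STREAMING_JAR_DIR/hadoop-streaming-1.0.4.jar" \
    hdfs://localhost:54310/user/Bibhu/BookTE1.csv \
    outsid
```

For -mapper mapper.R to work this way, the script has to be executable and start with an interpreter line such as #!/usr/bin/env Rscript, because Hadoop Streaming launches it as an ordinary program.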

    For more details, see Running WordCount on Hadoop using R script.
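
The stderr log in the question shows read.csv stopping with "no lines available in input", meaning the R script received no data where it tried to read. Hadoop Streaming hands input records to the mapper on standard input, so one way to reproduce such a failure without a cluster is to pipe a local sample into the mapper. The sketch below does that; run_mapper_locally, the stand-in mapper.sh and sample.csv are hypothetical, and the stand-in is a shell script only so that the example runs without R or Hadoop. Whether the asker's code1.sh actually reads standard input cannot be told from the question.

```shell
#!/bin/sh
# Sketch: run a streaming mapper locally the way Hadoop Streaming would,
# i.e. with the input records arriving on standard input.
# run_mapper_locally and all file names below are hypothetical.

run_mapper_locally() {
    mapper=$1      # path to the executable mapper script
    input=$2       # local copy of (a sample of) the input file

    # A non-zero exit status here corresponds to
    # "subprocess failed with code N" in the Hadoop task log.
    "$mapper" < "$input"
    status=$?
    echo "mapper exit status: $status" >&2
    return "$status"
}

# Stand-in mapper: emits "line<TAB>1" per record and exits with status 1
# when it receives no input lines at all.
tmpdir=$(mktemp -d)
cat > "$tmpdir/mapper.sh" <<'EOF'
#!/bin/sh
count=0
while IFS= read -r line; do
    count=$((count + 1))
    printf '%s\t1\n' "$line"
done
[ "$count" -gt 0 ] || { echo "no lines available in input" >&2; exit 1; }
EOF
chmod +x "$tmpdir/mapper.sh"
printf 'a,b\n1,2\n' > "$tmpdir/sample.csv"

run_mapper_locally "$tmpdir/mapper.sh" "$tmpdir/sample.csv"
```

With a real R mapper, the first argument would be the script passed to -mapper. If it succeeds when given a file name in the R console but fails when the data arrives through the pipe, the script is probably not reading from standard input.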

    Source: a similar question on Stack Overflow about this Hadoop Streaming failure (PipeMapRed.waitOutputThreads(): subprocess failed with code 1): https://stackoverflow.com/questions/17450816/