r - Hadoop streaming fails in RHadoop with error code 1

Reposted. Author: 可可西里. Updated: 2023-11-01 15:49:55

I am using RHadoop with the following code:

Sys.setenv(HADOOP_OPTS="-Djava.library.path=/usr/local/hadoop/lib/native")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
Sys.setenv(HADOOP_CMD="/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-3.0.0.jar")
Sys.setenv(JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64")

library(rJava)
library(rhdfs)
library(rmr2)
hdfs.init()

mapper = function(., X) {
  n = nrow(X)
  ones = matrix(rep(1, n), nrow = n, ncol = 1)
  ag = aggregate(cbind(ones, X[, 1:79]), by = list(X[, 80]), FUN = "sum")
  key = factor(ag[, 1])
  keyval(key, split(ag[, -1], key))
}

reducer = function(k, A) {
  keyval(k, list(Reduce('+', A)))
}

GroupSums <- from.dfs(mapreduce(input = "/ISCXFlowMeter.csv", map = mapper, reduce = reducer, combine = T))

When I run this code, I get the following error:

packageJobJar: [/tmp/hadoop-unjar7138506441946536619/] [] /tmp/streamjob6099552934186757596.jar tmpDir=null
2018-06-12 22:40:04,651 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-06-12 22:40:04,945 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-06-12 22:40:05,201 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/uel/.staging/job_1528838017005_0012
2018-06-12 22:40:06,158 INFO mapred.FileInputFormat: Total input files to process : 1
2018-06-12 22:40:06,171 INFO net.NetworkTopology: Adding a new node: /default-rack/127.0.0.1:9866
2018-06-12 22:40:06,233 INFO mapreduce.JobSubmitter: number of splits:2
2018-06-12 22:40:06,348 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-06-12 22:40:06,608 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1528838017005_0012
2018-06-12 22:40:06,610 INFO mapreduce.JobSubmitter: Executing with tokens: []
2018-06-12 22:40:06,945 INFO conf.Configuration: resource-types.xml not found
2018-06-12 22:40:06,945 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2018-06-12 22:40:07,022 INFO impl.YarnClientImpl: Submitted application application_1528838017005_0012
2018-06-12 22:40:07,249 INFO mapreduce.Job: The url to track the job: http://uel-Deskop-VM:8088/proxy/application_1528838017005_0012/
2018-06-12 22:40:07,251 INFO mapreduce.Job: Running job: job_1528838017005_0012
2018-06-12 22:40:09,301 INFO mapreduce.Job: Job job_1528838017005_0012 running in uber mode : false
2018-06-12 22:40:09,305 INFO mapreduce.Job: map 0% reduce 0%
2018-06-12 22:40:09,337 INFO mapreduce.Job: Job job_1528838017005_0012 failed with state FAILED due to: Application application_1528838017005_0012 failed 2 times due to AM Container for appattempt_1528838017005_0012_000002 exited with exitCode: 127
Failing this attempt.Diagnostics: [2018-06-12 22:40:08.734]Exception from container-launch.
Container id: container_1528838017005_0012_02_000001
Exit code: 127

[2018-06-12 22:40:08.736]Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
/bin/bash: /bin/java: No such file or directory

[2018-06-12 22:40:08.736]Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
/bin/bash: /bin/java: No such file or directory

For more detailed output, check the application tracking page: http://uel-Deskop-VM:8088/cluster/app/application_1528838017005_0012 Then click on links to logs of each attempt.
. Failing the application.
2018-06-12 22:40:09,368 INFO mapreduce.Job: Counters: 0
2018-06-12 22:40:09,369 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1

The ISCXFlowMeter.csv file stored in Hadoop is available here: https://www.dropbox.com/s/rbppzg6x2slzcjz/ISCXFlowMeter.csv?dl=1

Can you guide me on how to solve this problem?

Best answer

After some time, I was able to fix the error by adding the following properties to mapred-site.xml.

<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
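
The `/bin/bash: /bin/java: No such file or directory` line in the log above suggests that the container launch command expanded `$JAVA_HOME/bin/java` with an empty `JAVA_HOME`. As a rough sanity check (a sketch, not part of the original answer), the paths set with `Sys.setenv` in the question can be verified from the R session before a job is submitted. The helper name `check_hadoop_env` is hypothetical, and it only inspects the environment of the R client, not the environment inside the YARN containers.

```r
# Hypothetical helper: report which of the paths configured via Sys.setenv
# in the question actually exist on the machine running the R session.
check_hadoop_env <- function() {
  java_home <- Sys.getenv("JAVA_HOME", unset = "")
  paths <- c(
    java      = if (nzchar(java_home)) file.path(java_home, "bin", "java") else "",
    hadoop    = Sys.getenv("HADOOP_CMD", unset = ""),
    streaming = Sys.getenv("HADOOP_STREAMING", unset = "")
  )
  # file.exists("") is FALSE, so unset variables are reported as missing.
  ok <- file.exists(paths)
  names(ok) <- names(paths)
  ok
}

print(check_hadoop_env())
```

Any `FALSE` entry points at a variable that is unset or refers to a file that does not exist on this machine.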

However, the problem now is that the key-value result is NULL after the map-reduce job completes. Any help would be appreciated.
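
One way to narrow down a NULL result is to run the grouping logic of the mapper and reducer in plain R, without Hadoop, on a small data frame. The sketch below does that under stated assumptions: the data frame `X` is synthetic (two feature columns and one label column instead of the 80 columns in the real file), the function names `map_group_sums` and `reduce_group_sums` are hypothetical, and a plain list stands in for rmr2's `keyval`.

```r
# Synthetic stand-in for the CSV: two numeric feature columns plus a label
# column (the question's file uses columns 1:79 and 80 instead).
X <- data.frame(
  f1 = c(1, 2, 3, 4),
  f2 = c(10, 20, 30, 40),
  label = c("a", "b", "a", "b")
)

# Same aggregation as the question's mapper, with the column indices passed
# in as arguments; returns a plain list instead of an rmr2 keyval object.
map_group_sums <- function(X, feature_cols, label_col) {
  ones <- rep(1, nrow(X))
  ag <- aggregate(cbind(ones, X[, feature_cols]),
                  by = list(X[, label_col]), FUN = sum)
  key <- factor(ag[, 1])
  list(key = key, val = split(ag[, -1], key))
}

# Same combination step as the question's reducer: element-wise sum of the
# partial results collected for one key.
reduce_group_sums <- function(A) Reduce(`+`, A)

out <- map_group_sums(X, 1:2, 3)
print(out$val)

# Pretend two map tasks each emitted the same partial sums for key "a".
total_a <- reduce_group_sums(list(out$val$a, out$val$a))
print(total_a)
```

If this logic behaves as expected locally, the input side of the job is the next thing to inspect. One possibility, not confirmed by the source, is that `mapreduce` was called without an `input.format` argument: its default is `"native"`, whereas the input here is a CSV file, which rmr2 can read via `make.input.format("csv", sep = ",")`.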

Regarding "r - Hadoop streaming fails in RHadoop with error code 1", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50826293/
