
python - HADOOP_CONF_DIR not found error in a Python Pydoop program

Reposted. Author: 可可西里. Updated: 2023-11-01 15:05:49

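I am using Pydoop to connect to the HDFS file system from within a Python program. The program tries to read and write files in HDFS. When I try to run it, I get an error.

Command used to run it: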

hadoop jar /usr/share/bigdata/hadoop-1.2.0/contrib/streaming/hadoop-streaming-1.2.0.jar -file ./Methratio.py -mapper './Methratio.py  -d /user/hadoop/gnome.fa -r -g  -o hdfs://ai-ole6-main.ole6.com:54311/user/hadoop/bsmapout.txt hdfs://ai-ole6-main.ole6.com:54311/user/hadoop/Example.bam ' -input sampleinput.txt -output outfile

Error:

Traceback (most recent call last):

  File "/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201501251859_0001/attempt_201501251859_0001_m_000000_1/work/./Methratio.py", line 2, in <module>
import sys, time, os, array, optparse,pydoop.hdfs as hdfs
File "/usr/local/lib/python2.7/site-packages/pydoop-1.0.0_rc1-py2.7.egg/pydoop/hdfs/__init__.py", line 98, in <module>
init()
File "/usr/local/lib/python2.7/site-packages/pydoop-1.0.0_rc1-py2.7.egg/pydoop/hdfs/__init__.py", line 92, in init
pydoop.hadoop_classpath(), _ORIG_CLASSPATH, pydoop.hadoop_conf()
File "/usr/local/lib/python2.7/site-packages/pydoop-1.0.0_rc1-py2.7.egg/pydoop/__init__.py", line 103, in hadoop_classpath
return _PATH_FINDER.hadoop_classpath(hadoop_home)
File "/usr/local/lib/python2.7/site-packages/pydoop-1.0.0_rc1-py2.7.egg/pydoop/hadoop_utils.py", line 551, in hadoop_classpath
jars.extend([self.hadoop_native(), self.hadoop_conf()])
File "/usr/local/lib/python2.7/site-packages/pydoop-1.0.0_rc1-py2.7.egg/pydoop/hadoop_utils.py", line 493, in hadoop_conf
PathFinder.__error("hadoop conf dir", "HADOOP_CONF_DIR")
File "/usr/local/lib/python2.7/site-packages/pydoop-1.0.0_rc1-py2.7.egg/pydoop/hadoop_utils.py", line 385, in __error
raise ValueError("%s not found, try setting %s" % (what, env_var))
ValueError: hadoop conf dir not found, try setting HADOOP_CONF_DIR
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

Code:

with hdfs.open(options.reffile) as hdfsfile:
    # Iterate over the open HDFS file object line by line
    for line in hdfsfile:
        if line[0] == '>':
            pass  # some processing

Best answer

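The HADOOP_CONF_DIR environment variable must be set to the appropriate location, that is, the path of the folder containing core-site.xml, mapred-site.xml, hdfs-site.xml, and so on. These files can usually be found under the hadoop/etc/ folder.
In my case, I installed Hadoop 2.6 from a tarball and placed the extracted folder in /usr/local.
I added the following line to ~/.bashrc:

export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

Then I ran the following command in a terminal:

source ~/.bashrc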
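
A shell profile such as ~/.bashrc may not be read by the process that launches a streaming mapper, so it can help to set the variable from inside the script before Pydoop is imported. The following is only a minimal sketch under stated assumptions: the helper names find_conf_dir and ensure_hadoop_conf_dir are hypothetical, and the candidate paths are guesses based on the install locations mentioned in the question and answer, so adjust them to your cluster.

```python
import os

# Assumed candidate locations (not confirmed by the original post):
CANDIDATE_CONF_DIRS = [
    "/usr/local/hadoop/etc/hadoop",          # Hadoop 2.x tarball layout, as in the answer
    "/usr/share/bigdata/hadoop-1.2.0/conf",  # assumed Hadoop 1.x layout for the question's install
]


def find_conf_dir(candidates):
    """Return the first directory that contains core-site.xml, or None."""
    for path in candidates:
        if os.path.isfile(os.path.join(path, "core-site.xml")):
            return path
    return None


def ensure_hadoop_conf_dir(candidates=CANDIDATE_CONF_DIRS, environ=os.environ):
    """Set HADOOP_CONF_DIR if it is unset; return the value in use."""
    current = environ.get("HADOOP_CONF_DIR")
    if current:
        return current  # respect a value that is already configured
    found = find_conf_dir(candidates)
    if found is None:
        raise ValueError("no Hadoop conf dir found in: %s" % ", ".join(candidates))
    environ["HADOOP_CONF_DIR"] = found
    return found


# Intended use at the top of the mapper script, before the Pydoop import:
#     ensure_hadoop_conf_dir()
#     import pydoop.hdfs as hdfs
```

The traceback in the question shows the error being raised while pydoop.hdfs is imported, so the variable has to be in place before that import line runs. Hadoop Streaming also has a -cmdenv name=value option for passing environment variables to the mapper command, which may be an alternative worth trying.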

Regarding "python - HADOOP_CONF_DIR not found error in a Python Pydoop program", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28137097/
