After formatting my HDFS, I get the following error:
2015-05-28 21:41:57,544 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/dfs/datanode: namenode clusterID = CID-e77ee39a-ab4a-4de1-b1a4-9d4da78b83e8; datanode clusterID = CID-6c250e90-658c-4363-9346-972330ff8bf9
2015-05-28 21:41:57,545 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1387)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1352)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:316)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:852)
at java.lang.Thread.run(Thread.java:745)
...blah...
SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at der-Inspiron-3521/127.0.1.1
************************************************************/
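From what I can tell, hdfs namenode -format mints a fresh clusterID every time unless one is passed explicitly, which is why the two IDs above no longer match. If I am reading the docs right, the old datanode clusterID could be reused at format time like this:

hdfs namenode -format -clusterid CID-6c250e90-658c-4363-9346-972330ff8bf9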
Here are the steps I performed:
sbin/stop-dfs.sh
hdfs namenode -format
sbin/start-dfs.sh
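The mismatched IDs live in the VERSION files under each storage directory, so (using the paths from my hdfs-site.xml below) they can be compared with:

cat /usr/local/hadoop/dfs/namenode/current/VERSION
cat /usr/local/hadoop/dfs/datanode/current/VERSION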
For reference, the temporary-directory setting in my core-site.xml is:
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop</value>
<description>A base for other temporary directories.
</description>
</property>
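(My fs.defaultFS is not shown above; judging from the localhost/127.0.0.1:9000 in the error log, I assume it is set along these lines:)

<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>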
And the namenode and datanode directories in my hdfs-site.xml:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/dfs/datanode</value>
</property>
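One fix I have seen suggested, which I have not tried exactly as written, is to stop everything, wipe the datanode directory so it re-registers with the new clusterID (destructive: it deletes all block data, which is fine on a fresh single-node setup), and reformat:

sbin/stop-dfs.sh
rm -rf /usr/local/hadoop/dfs/datanode/*
hdfs namenode -format
sbin/start-dfs.sh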
Update: I have dug deeper into the problem, but I am still hitting the same kind of error. I was able to run hdfs namenode -format and update the VERSION file as suggested. After that I used hdfs dfs -ls and hdfs dfs -mkdir to create /user/der, where der is my login name. However, when I run my pig file, I get mkDirs and chmod errors from it. Here are the permissions on my datanode and namenode directories:
drwx------ 3 der der 4096 May 29 08:13 datanode
drwxrwxrwx 4 root root 4096 May 28 11:34 name
drwxrwxr-x 3 der der 4096 May 29 08:13 namenode
drwxrwxr-x 3 der der 4096 May 29 08:13 namesecondary
drwxr-xr-x 2 root root 4096 May 28 11:46 ww
It seems the datanode directory grants permissions only to the owner, with nothing for group or others.
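Given that name and ww are owned by root, one thing worth ruling out is ownership: handing the whole tree back to my user (assuming der:der is the right owner) would be something like:

sudo chown -R der:der /usr/local/hadoop/dfs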
Here is the error from my pig script:
2015-05-29 08:37:27,152 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:totalmiles.pig got an error while submitting
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:724)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:600)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:94)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl
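What strikes me in this trace is that it goes through RawLocalFileSystem.setPermission, i.e. the job submitter is chmod-ing a local path (the job staging area), not HDFS. Where that staging directory resolves to can be checked with something like this (property name as I understand it for MR2):

hdfs getconf -confKey yarn.app.mapreduce.am.staging-dir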
And here is my pig script:
records = LOAD '1987.csv' USING PigStorage(',') AS
    (Year, Month, DayofMonth, DayOfWeek,
     DepTime, CRSDepTime, ArrTime, CRSArrTime,
     UniqueCarrier, FlightNum, TailNum, ActualElapsedTime,
     CRSElapsedTime, AirTime, ArrDelay, DepDelay,
     Origin, Dest, Distance:int, TaxIn,
     TaxiOut, Cancelled, CancellationCode, Diverted,
     CarrierDelay, WeatherDelay, NASDelay, SecurityDelay,
     lateAircraftDelay);
milage_recs = GROUP records ALL;
tot_miles = FOREACH milage_recs GENERATE SUM(records.Distance);
STORE tot_miles INTO 'totalmiles4';
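For completeness, I invoke it with something along the lines of:

pig -x mapreduce totalmiles.pig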
Update: Incidentally, I also ran chmod go+rw on the datanode directory (after stopping the namenode and datanode servers). That did not help either.
Update, May 30: A few more details. I changed the LOAD path in my pig script to an absolute path under HDFS:
records = LOAD '/user/der/1987.csv' USING PigStorage(',') AS
I get the same error. On the client side, this is what it reports; the only difference is that the failed input read lacks the hdfs:// prefix:
Failed to read data from "/user/der/1987.csv"
Output(s):
Failed to produce result in "hdfs://localhost:9000/user/der/totalmiles4"
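To double-check that the input really exists under the fully qualified URI, I can list it explicitly:

hdfs dfs -ls hdfs://localhost:9000/user/der/1987.csv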
And here is the server-side namenode log at the moment my pig script makes the failing file request. The log (watched with tail -f) keeps scrolling, which shows the server is accepting requests from the pig commands:
2015-05-30 07:01:28,140 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
blk_1073741885_1061{UCState=UNDER_CONSTRUCTION,
truncateBlock=null,
primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-c84e0e37-2726-44da-af3e-67167c1010d1:NORMAL:127.0.0.1:50010|RBW]]}
size 0
2015-05-30 07:01:28,148 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile:
/tmp/temp-11418443/tmp85697770/automaton-1.11-8.jar
is closed by DFSClient_NONMAPREDUCE_-1939565577_1
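So the namenode is clearly reachable and writable from pig: automaton-1.11-8.jar is one of pig's bundled dependencies being staged into /tmp on HDFS. While a job is being submitted, the staged files can be inspected with:

hdfs dfs -ls -R /tmp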
I just need to get the pig source code and trace the exact hdfs commands it issues. I suspect something is wrong with how I configured my hadoop hdfs services.
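Short of reading the source, raising the client-side log level should reveal those calls; if I remember the flag right, it is:

pig -x mapreduce -d DEBUG totalmiles.pig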