gpt4 book ai didi

azure - hadoop集群,datanode无法运行,0个节点被排除

转载 作者:可可西里 更新时间:2023-11-01 15:23:55 24 4
gpt4 key购买 nike

我正在Azure中的Hadoop中创建一个多节点(1主3从)集群,我认为所有设置都已完成,但是当我运行测试文件时,它遇到了与Stackoverflow中其他人类似的问题,< strong>我已经尝试过他们的解决方案,但是,这个问题仍然无法解决。谁能帮助我,我已经被这个问题困扰了几天

 org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /benchmarks/TestDFSIO/io_control/in_file_test_io_0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1797)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2559)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:846)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:510)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:868)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:814)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2603)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
at org.apache.hadoop.ipc.Client.call(Client.java:1349)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:444)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1845)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1645)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:710)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /benchmarks/TestDFSIO/io_control/in_file_test_io_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1797)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2559)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:846)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:510)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:868)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:814)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2603)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
at org.apache.hadoop.ipc.Client.call(Client.java:1349)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:444)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1845)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1645)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:710)


2018-04-13 01:25:24,359 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=10.0.0.7, hostname=10.0.0.7)
2018-04-13 01:25:24,359 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call Call#3 Retry#0 org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 10.0.0.7:59340

我检查hadoop-hduser-namenode-master.log并得到一些错误,似乎Damenode无法与Namenode连接。

org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=10.0.0.7, hostname=10.0.0.7): 
DatanodeRegistration(0.0.0.0:50010, datanodeUuid=5b728ca4-01d0-4540-b216-e31c01024b49, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-57;cid=CID-e37e460a-966c-4067-8104-393b2c008d4a;nsid=1665807189;c=1523582587978)

但是我已经将所有slave的私钥添加到master中,并修改了/etc/hostname/etc/hosts,我还修改了master中的slave(slave1,slave2,slave3)和masters(master)文件。

#127.0.0.1 localhost
master 10.0.0.4
slave1 10.0.0.5
slave2 10.0.0.6
slave3 10.0.0.7
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

我的 core-site.xml

core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>

yarn 站点.xml

<configuration>
<property>
<name>yarn.nodemanager.aux-services </name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>

hdfs-site.xml(主站使用namenode,从站使用datanode)

<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
</configuration>

mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

此外,我已经检查过状态

Filesystem      Size  Used Avail Use% Mounted on
udev 944M 0 944M 0% /dev
tmpfs 191M 3.1M 188M 2% /run
/dev/sda1 30G 2.8G 27G 10% /
tmpfs 954M 0 954M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 954M 0 954M 0% /sys/fs/cgroup
/dev/sdb1 3.9G 8.0M 3.7G 1% /mnt
tmpfs 191M 0 191M 0% /run/user/1000

最后,我使用JPS检查每个节点(当start-all.sh时):

从机2

7175 Jps

大师

11961 Jps

他们是不正常的。我该怎么办?

最佳答案

检查您的/etc/hosts。应该是:

10.0.0.4 master 
10.0.0.5 slave1
10.0.0.6 slave2
10.0.0.7 slave3

关于azure - hadoop集群,datanode无法运行,0个节点被排除,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/49808490/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com