apache-spark - Unable to run Spark job on a YARN cluster: connection failed exception

Reposted · Author: 行者123 · Updated: 2023-12-02 20:36:00

I am running a simple Spark job on a YARN cluster, and my yarn-site.xml is configured as follows:

<configuration>

<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>

<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>

<property>
<name>yarn.log.server.url</name>
<value>http://localhost:19888/jobhistory/logs</value>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
<name>yarn.nodemanager.container-executor.class</name>
<value>org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</value>
</property>

<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/data/yarn/local</value>
</property>

<property>
<name>yarn.nodemanager.log-aggregation.compression-type</name>
<value>gz</value>
</property>

<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/data/yarn/log</value>
</property>

<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>604800</value>
</property>

<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/app-logs</value>
</property>

<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
</property>

<property>
<name>yarn.nodemanager.address</name>
<value>localhost:45454</value>
</property>

<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>

<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8050</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:8088</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<property>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.timeline-service.generic-application-history.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.timeline-service.address</name>
<value>localhost:10200</value>
</property>

<property>
<name>yarn.timeline-service.enabled</name>
<value>false</value>
</property>

<property>
<name>yarn.timeline-service.generic-application-history.store-class</name>
<value>org.apache.hadoop.yarn.server.applicationhistoryservice.NullApplicationHistoryStore</value>
</property>

<property>
<name>yarn.timeline-service.leveldb-timeline-store.path</name>
<value>/data/yarn/timeline</value>
</property>

<property>
<name>yarn.timeline-service.store-class</name>
<value>org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore</value>
</property>

<property>
<name>yarn.timeline-service.ttl-enable</name>
<value>true</value>
</property>

<property>
<name>yarn.timeline-service.ttl-ms</name>
<value>604800000</value>
</property>

<property>
<name>yarn.timeline-service.webapp.address</name>
<value>localhost:8188</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>

<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>

</configuration>


Here is the error:
Exception in thread "main" java.net.ConnectException: Call From abhijeet.local/192.168.1.13 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.GeneratedConstructorAccessor4.newInstance(Unknown Source)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
27-06-2018 03:32:33 PDT Spark123 INFO - at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client.call(Client.java:1479)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client.call(Client.java:1412)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
27-06-2018 03:32:33 PDT Spark123 INFO - at com.sun.proxy.$Proxy8.getNewApplication(Unknown Source)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:221)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
27-06-2018 03:32:33 PDT Spark123 INFO - at java.lang.reflect.Method.invoke(Method.java:498)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
27-06-2018 03:32:33 PDT Spark123 INFO - at com.sun.proxy.$Proxy9.getNewApplication(Unknown Source)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:219)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:227)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:159)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.yarn.Client.run(Client.scala:1109)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1168)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.yarn.Client.main(Client.scala)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
27-06-2018 03:32:33 PDT Spark123 INFO - at java.lang.reflect.Method.invoke(Method.java:498)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
27-06-2018 03:32:33 PDT Spark123 INFO - Caused by: java.net.ConnectException: Connection refused
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
27-06-2018 03:32:33 PDT Spark123 INFO - at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
27-06-2018 03:32:33 PDT Spark123 INFO - at org.apache.hadoop.ipc.Client.call(Client.java:1451)
27-06-2018 03:32:33 PDT Spark123 INFO - ... 25 more
27-06-2018 03:32:33 PDT Spark123 INFO - Process completed unsuccessfully in 1214 seconds.
27-06-2018 03:32:33 PDT Spark123 ERROR - Job run failed!
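Note that `0.0.0.0:8032` is YARN's default ResourceManager address, which suggests the submitting client never picked up the yarn-site.xml above. Before changing any config, it can help to confirm whether the ResourceManager port is even reachable from the machine running spark-submit. The following is a minimal sketch (not from the original post); the host 192.168.1.13 and port 8050 are assumptions taken from the error message and the yarn.resourcemanager.address setting above:

```scala
import java.net.{InetSocketAddress, Socket}

object PortCheck {
  // Attempt a TCP connection to host:port; true if it succeeds within timeoutMs.
  def isReachable(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      case _: java.io.IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // 8032 is YARN's default ResourceManager IPC port; this yarn-site.xml uses 8050.
    println(s"ResourceManager reachable: ${isReachable("192.168.1.13", 8050)}")
  }
}
```

If this prints false for the address the client is actually using, the failure is a plain networking/config problem rather than anything in the Spark job itself.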

Let me know if you need any other information.

I don't understand what is causing this, so I haven't been able to fix it.

My Spark code:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.log4j._
import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.SparkContext._
import org.apache.spark.sql.functions._
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.types._

object Test {
  def main(args: Array[String]): Unit = {
    // Set log level to show errors only
    Logger.getLogger("org").setLevel(Level.ERROR)

    // Set up the SparkSession
    val spark = SparkSession.builder
      .appName("Journaling")
      .config("spark.master", "yarn")
      .getOrCreate()
    val sc = SparkContext.getOrCreate()

    // Configuration for reading a file from HDFS
    val conf = new Configuration()
    conf.set("fs.defaultFS", "hdfs://localhost:8020")
    val fs = FileSystem.get(conf)
    val df = spark.read.format("csv")
      .option("inferSchema", "true")
      .option("header", "true")
      .load("hdfs://localhost:8020/fakefriends.csv")

    df.show()
  }
}

The code above simply reads a CSV file from HDFS, creates a DataFrame from it, and shows the first 20 rows.

I can run the same job successfully in local client mode.

Best answer

Try changing every occurrence of localhost to your machine's actual IP address (192.168.1.13). When running in cluster mode, Spark cannot resolve 0.0.0.0.
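As a sketch, the relevant yarn-site.xml entries would change along these lines (192.168.1.13 is taken from the error message; verify it matches the machine's actual address, and apply the same change to the other localhost entries):

```xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>192.168.1.13</value>
</property>

<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.1.13:8050</value>
</property>

<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>192.168.1.13:8088</value>
</property>
```

Also make sure the client submitting the job can see this file (e.g. via HADOOP_CONF_DIR); otherwise the client falls back to the default 0.0.0.0:8032 shown in the stack trace.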

Regarding "apache-spark - Unable to run Spark job on a YARN cluster: connection failed exception", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51077469/
