
scala - Submitting a Spark job on a YARN cluster

Reposted. Author: 可可西里. Updated: 2023-11-01 14:20:22

I have been struggling with the following problem for more than two days now.

I wrote a basic "HelloWorld" script in Scala:

object Hello extends App {
  println("WELCOME TO A FIRST TEST WITH SCALA COMPILED WITH SBT counting fr. 1:15 with sleep 1")
  val data = 1 to 15

  // Print each value, pausing one second between iterations.
  for (a <- data) {
    println("Value of a: " + a)
    Thread.sleep(1000)
  }
}

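I then compiled it with SBT to produce a JAR.

Next I transferred everything to a cluster running HDP 2.2.4.2 (the Hortonworks sandbox, running on a virtual Linux machine).

I am actually able to run the job on the cluster in yarn-client mode with the following command: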

spark-submit --verbose --master yarn-client --class Hello SCALA/hello.jar

However, when I try to submit the same HelloWorld job in yarn-cluster mode with the following command

spark-submit --verbose --master yarn-cluster --class Hello SCALA/hello.jar

the job first runs normally (the output is as expected and the exit code is 0), but then it stops with the following:

15/06/05 15:52:09 INFO Client: Application report for application_1433491352951_0010 (state: FAILED)

15/06/05 15:52:09 INFO Client:
client token: N/A
diagnostics: Application application_1433491352951_0010 failed 2 times due to AM Container for appattempt_1433491352951_0010_000002 exited with exitCode: 0
For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/proxy/application_1433491352951_0010/Then, click on links to logs of each attempt.
Diagnostics: Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1433519471297
final status: FAILED
tracking URL: http://sandbox.hortonworks.com:8088/cluster/app/application_1433491352951_0010
user: root
Error: application failed with exception
org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:522)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:35)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:139)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:367)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:77)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I then decided to check the logs with the following command line:

yarn logs -applicationId application_1433491352951_0010

This is what I got:

15/06/05 15:56:33 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
15/06/05 15:56:33 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/192.168.182.129:8050
15/06/05 15:56:35 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/06/05 15:56:35 INFO compress.CodecPool: Got brand-new decompressor [.deflate]


Container: container_e08_1433491352951_0010_01_000001 on sandbox.hortonworks.com_45454
========================================================================================
LogType:stderr
Log Upload Time:Fri Jun 05 15:52:10 +0000 2015
LogLength:2050
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/usercache/root/filecache/28/spark-assembly-1.2.1.2.2.4.2-2-hadoop2.6.0.2.2.4.2-2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/usercache/root/filecache/29/hello.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/06/05 15:51:18 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/06/05 15:51:20 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1433491352951_0010_000001
15/06/05 15:51:21 INFO spark.SecurityManager: Changing view acls to: yarn,root
15/06/05 15:51:21 INFO spark.SecurityManager: Changing modify acls to: yarn,root
15/06/05 15:51:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
15/06/05 15:51:21 INFO yarn.ApplicationMaster: Starting the user JAR in a separate Thread
15/06/05 15:51:21 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/06/05 15:51:21 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 0
15/06/05 15:51:31 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 1
15/06/05 15:51:36 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
15/06/05 15:51:41 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
15/06/05 15:51:41 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
15/06/05 15:51:41 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1433491352951_0010

LogType:stdout
Log Upload Time:Fri Jun 05 15:52:10 +0000 2015
LogLength:300
Log Contents:
WELCOME TO A FIRST TEST WITH SCALA COMPILED WITH SBT counting fr. 1:15 with sleep 1
Value of a: 1
Value of a: 2
Value of a: 3
Value of a: 4
Value of a: 5
Value of a: 6
Value of a: 7
Value of a: 8
Value of a: 9
Value of a: 10
Value of a: 11
Value of a: 12
Value of a: 13
Value of a: 14
Value of a: 15



Container: container_e08_1433491352951_0010_02_000001 on sandbox.hortonworks.com_45454
========================================================================================
LogType:stderr
Log Upload Time:Fri Jun 05 15:52:10 +0000 2015
LogLength:2050
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/usercache/root/filecache/28/spark-assembly-1.2.1.2.2.4.2-2-hadoop2.6.0.2.2.4.2-2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/usercache/root/filecache/29/hello.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/06/05 15:51:45 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/06/05 15:51:47 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1433491352951_0010_000002
15/06/05 15:51:48 INFO spark.SecurityManager: Changing view acls to: yarn,root
15/06/05 15:51:48 INFO spark.SecurityManager: Changing modify acls to: yarn,root
15/06/05 15:51:48 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
15/06/05 15:51:48 INFO yarn.ApplicationMaster: Starting the user JAR in a separate Thread
15/06/05 15:51:48 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/06/05 15:51:48 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 0
15/06/05 15:51:58 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 1
15/06/05 15:52:03 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
15/06/05 15:52:08 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
15/06/05 15:52:08 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
15/06/05 15:52:08 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1433491352951_0010

LogType:stdout
Log Upload Time:Fri Jun 05 15:52:10 +0000 2015
LogLength:300
Log Contents:
WELCOME TO A FIRST TEST WITH SCALA COMPILED WITH SBT counting fr. 1:15 with sleep 1
Value of a: 1
Value of a: 2
Value of a: 3
Value of a: 4
Value of a: 5
Value of a: 6
Value of a: 7
Value of a: 8
Value of a: 9
Value of a: 10
Value of a: 11
Value of a: 12
Value of a: 13
Value of a: 14
Value of a: 15

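I then took a HelloWorld project that someone had suggested, recompiled it, and tried again. Now I run into a different problem. When I submit the job with the following command: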

spark-submit --verbose --master yarn-cluster SCALA/hello.jar

I get the following message repeated endlessly:

15/06/08 16:42:35 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

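I don't really understand this: it looks as if the server is not responding, even though the program should be running on the Hadoop cluster from the sandbox.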
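One thing that may be worth checking here: 0.0.0.0:8032 is YARN's default ResourceManager address, whereas the earlier log shows the sandbox's ResourceManager at sandbox.hortonworks.com:8050. That would be consistent with the process not picking up the cluster's yarn-site.xml, although the logs above do not confirm this. The sketch below is only a diagnostic aid under that assumption; the object name YarnConfCheck and the helper findYarnSite are hypothetical and not part of Spark or Hadoop.

```scala
import java.io.File

object YarnConfCheck {
  // Environment variables that Spark's YARN documentation says should point
  // at the client-side Hadoop/YARN configuration directory.
  val confVars: Seq[String] = Seq("HADOOP_CONF_DIR", "YARN_CONF_DIR")

  // Returns every yarn-site.xml that exists under the directories named by
  // the variables above (hypothetical helper, for illustration only).
  def findYarnSite(env: Map[String, String]): Seq[File] =
    confVars
      .flatMap(env.get)
      .map(dir => new File(dir, "yarn-site.xml"))
      .filter(_.isFile)

  def main(args: Array[String]): Unit = {
    val found = findYarnSite(sys.env)
    if (found.isEmpty)
      println("No yarn-site.xml found via " + confVars.mkString(" or "))
    else
      found.foreach(f => println("Found " + f.getAbsolutePath))
  }
}
```

If nothing is found, setting one of these variables to the directory that holds the cluster's client configuration before calling spark-submit would be the first thing to try.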

Best answer

In my case, I was using:

import org.apache.spark.SparkConf
val config = new SparkConf()
config.setMaster("local[*]") // hard-coded master: this is the call that had to be removed

and submitting the job with:

spark-submit --master yarn-cluster ..

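After removing config.setMaster from my code, the problem was solved. Properties set directly on a SparkConf take precedence over the flags passed to spark-submit, so the hard-coded local[*] was overriding --master yarn-cluster.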
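As a minimal sketch of what that looks like in a complete program, assuming the Spark 1.x core API is on the classpath: the application below builds its SparkConf without calling setMaster and leaves the choice of master to spark-submit. The names HelloSpark and buildConf are hypothetical. It also creates a SparkContext, which the Hello object in the question never does; the ApplicationMaster log in the question reports "SparkContext did not initialize", so that may be relevant to the first failure as well.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HelloSpark {
  // Build the configuration without setMaster, so the value given to
  // spark-submit through --master is the one that takes effect.
  def buildConf(): SparkConf = new SparkConf().setAppName("HelloSpark")

  // An explicit main method is used instead of `extends App`, which the
  // Spark documentation advises against for applications.
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(buildConf())
    try {
      // A trivial job so that the context is actually used on the cluster.
      val total = sc.parallelize(1 to 15).reduce(_ + _)
      println("Sum of 1 to 15: " + total)
    } finally {
      // Stop the context so the application can finish cleanly.
      sc.stop()
    }
  }
}
```

Packaged into the JAR, this would be submitted the same way as before, for example with spark-submit --master yarn-cluster --class HelloSpark SCALA/hello.jar, where the JAR path is the one from the question.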

Regarding "scala - Submitting a Spark job on a YARN cluster", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30670933/
