
apache-spark - Why does Spark on YARN fail in cluster mode with "Exception in thread "Driver" java.lang.NullPointerException"?

Reposted. Author: 行者123. Last updated: 2023-12-04 22:40:03

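I'm using emr-5.4.0 with Spark 2.1.0. I understand what a NullPointerException is; this question is about why it is thrown in this particular case.

I can't figure out why I'm getting a NullPointerException in the driver thread.

My job fails with this error: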

18/03/29 20:07:52 INFO ApplicationMaster: Starting the user application in a separate Thread
18/03/29 20:07:52 INFO ApplicationMaster: Waiting for spark context initialization...
Exception in thread "Driver" java.lang.NullPointerException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
18/03/29 20:07:52 ERROR ApplicationMaster: Uncaught exception:
java.lang.IllegalStateException: SparkContext is null but app is still running!
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:415)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:766)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:764)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/03/29 20:07:52 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Deleting staging directory hdfs://<ip-address>.ec2.internal:8020/user/hadoop/.sparkStaging/application_1522348295743_0010
18/03/29 20:07:52 INFO ShutdownHookManager: Shutdown hook called
End of LogType:stderr

I submitted the job with:
spark-submit --deploy-mode cluster --master yarn --num-executors 40 --executor-cores 16 --executor-memory 100g --driver-cores 8 --driver-memory 100g --class <package.class_name> --jars <s3://s3_path/some_lib.jar> <s3://s3_path/class.jar>

My class looks like this:
class MyClass {

  def main(args: Array[String]): Unit = {
    val c = new MyClass()
    c.process()
  }

  def process(): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    ....
  }

  ...
}

Best answer

Change class MyClass to object MyClass and you're done.

While we're at it, I'd also change class MyClass to object MyClass extends App and remove def main(args: Array[String]): Unit (which extends App provides).

I reported an improvement for Spark 2.3.0, [SPARK-23830] Spark on YARN in cluster deploy mode fail with NullPointerException when a Spark application is a Scala class not object, so that this case is reported clearly to end users.

Digging deeper into how Spark on YARN works, the following message is printed when the ApplicationMaster of a Spark application starts the driver (you used --deploy-mode cluster --master yarn with spark-submit).
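As a minimal sketch of the first suggestion, the class from the question can be turned into an object as shown here. It assumes Spark 2.x is on the classpath, and the comment inside process() stands in for the job logic that the question elides. The sketch keeps an explicit main rather than extends App because the Spark Quick Start advises defining main() instead of extending scala.App.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Declared as an object, so the compiler emits a static main(String[])
// that the YARN ApplicationMaster can invoke without an instance.
object MyClass {

  def main(args: Array[String]): Unit = {
    process()
  }

  def process(): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    // job logic (elided in the original question) goes here
  }
}
```

The spark-submit command from the question can stay unchanged, with --class still pointing at the fully qualified name of MyClass.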

ApplicationMaster: Starting the user application in a separate Thread



After that INFO message, you should see another one:

ApplicationMaster: Waiting for spark context initialization...



This is part of the driver initialization when the ApplicationMaster runs.

The Exception in thread "Driver" java.lang.NullPointerException is caused by the following code:
val mainMethod = userClassLoader.loadClass(args.userClass)
.getMethod("main", classOf[Array[String]])

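My understanding is that mainMethod resolves to an instance method here: getMethod throws NoSuchMethodException rather than returning null, and a main defined in a class (not an object) is not static. The following line passes null as the receiver, which "triggers" the NullPointerException inside invoke, matching the NativeMethodAccessorImpl.invoke0 frame at the top of the stack trace: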
mainMethod.invoke(null, userArgs.toArray)
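The failure can be reproduced without Spark or YARN. The sketch below is an illustration only, not Spark's code: ClassWithMain and ReflectiveMainDemo are hypothetical names, and the lookup and invocation simply mirror the two snippets quoted above. It shows that a main defined in a Scala class is found by getMethod but is not static, so invoking it with a null receiver raises NullPointerException.

```scala
import java.lang.reflect.{InvocationTargetException, Method, Modifier}

// Hypothetical stand-in for a user application declared as a class.
class ClassWithMain {
  def main(args: Array[String]): Unit = println("never reached")
}

object ReflectiveMainDemo {
  // Same style of lookup as the quoted code: resolve main(Array[String]).
  def lookupMain(cls: Class[_]): Method =
    cls.getMethod("main", classOf[Array[String]])

  // Invokes the method with a null receiver, as a launcher does for a
  // static main, and returns the simple name of any exception raised.
  def invokeWithNullReceiver(m: Method): Option[String] =
    try {
      m.invoke(null, Array.empty[String])
      None
    } catch {
      case e: InvocationTargetException => Some(e.getCause.getClass.getSimpleName)
      case e: Exception                 => Some(e.getClass.getSimpleName)
    }

  def main(args: Array[String]): Unit = {
    val m = lookupMain(classOf[ClassWithMain])
    // prints "static? false"
    println(s"static? ${Modifier.isStatic(m.getModifiers)}")
    // prints "outcome: Some(NullPointerException)"
    println(s"outcome: ${invokeWithNullReceiver(m)}")
  }
}
```

A launcher could run the same Modifier.isStatic check before calling invoke and report a readable error instead, which is the kind of improvement SPARK-23830 asks for.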

The thread is indeed called Driver (as in Exception in thread "Driver" java.lang.NullPointerException), as set in these lines:
userThread.setContextClassLoader(userClassLoader)
userThread.setName("Driver")
userThread.start()
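To see how the thread name ends up in the log line, here is a small sketch, again independent of Spark. DriverThreadNameDemo and uncaughtOn are hypothetical names, and the custom handler only imitates the first line that the JVM's default handler prints for an uncaught exception.

```scala
import java.util.concurrent.atomic.AtomicReference

object DriverThreadNameDemo {
  // Runs `body` on a thread with the given name and returns the header line
  // built for any exception that escapes that thread.
  def uncaughtOn(threadName: String)(body: => Unit): Option[String] = {
    val report = new AtomicReference[Option[String]](None)
    val t = new Thread(new Runnable { def run(): Unit = body })
    t.setName(threadName)
    t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      def uncaughtException(th: Thread, e: Throwable): Unit =
        report.set(Some(s"""Exception in thread "${th.getName}" ${e.toString}"""))
    })
    t.start()
    t.join() // wait for the thread to finish so the handler has already run
    report.get()
  }

  def main(args: Array[String]): Unit = {
    // prints: Exception in thread "Driver" java.lang.NullPointerException
    uncaughtOn("Driver")(throw new NullPointerException).foreach(println)
  }
}
```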

The line numbers differ because I used Spark 2.3.0 to reference the lines, while you are using emr-5.4.0 with Spark 2.1.0.

Regarding apache-spark - Why does Spark on YARN fail in cluster mode with "Exception in thread "Driver" java.lang.NullPointerException"?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49564334/
