java - Scala, SparkLauncher: Cannot run program "/etc/spark/conf.cloudera.CD-SPARK_ON_YARN-brkvSOzr/yarn-conf/topology.py"

Reposted. Author: 行者123. Updated: 2023-11-30 06:53:05

The code below is built as a jar and run with the spark-submit command from a PuTTY session. It works fine.

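import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val conf = new SparkConf().setAppName("ABC")
val sc = new SparkContext(conf)

// The HiveContext is bound to the name sqlContext, which the import and the query below use.
val sqlContext = new HiveContext(sc)
import sqlContext.implicits._

sqlContext.sql("query")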

However, when I run the same code through SparkLauncher, it throws the error below. Master: yarn-cluster; Spark version: 1.6.

java.io.IOException: Cannot run program "/etc/spark/conf.cloudera.CD-SPARK_ON_YARN-brkvSOzr/yarn-conf/topology.py" (in directory "/data4/yarn/nm/usercache/ppmingusrdev/appcache/application_14823231312_123/container_14866508534534_144_01_000004"): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:548)
at org.apache.hadoop.util.Shell.run(Shell.java:504)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:81)
at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$handleAllocatedContainers$2.apply(YarnAllocator.scala:337)
at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$handleAllocatedContainers$2.apply(YarnAllocator.scala:336)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.deploy.yarn.YarnAllocator.handleAllocatedContainers(YarnAllocator.scala:336)
at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:236)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:368)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
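
The question does not show the launcher code itself. As a rough sketch only, a launch of this kind might look like the following in Scala. The jar path, the main class name, and the `LaunchAbc` object are hypothetical; the sketch assumes the `spark-launcher` artifact is on the classpath and that `SPARK_HOME` is set in the environment (otherwise `setSparkHome` would be needed).

```scala
import org.apache.spark.launcher.SparkLauncher

object LaunchAbc {
  // Builds, but does not start, a launcher for the job shown in the question.
  def buildLauncher(appJar: String, mainClass: String): SparkLauncher =
    new SparkLauncher()
      .setAppResource(appJar)     // hypothetical jar path supplied by the caller
      .setMainClass(mainClass)    // hypothetical main class name
      .setAppName("ABC")
      .setMaster("yarn-cluster")  // Spark 1.6 style master string, as in the question

  def main(args: Array[String]): Unit = {
    // launch() starts spark-submit as a child process and returns a java.lang.Process.
    val process = buildLauncher("/path/to/abc.jar", "com.example.Abc").launch()
    val exitCode = process.waitFor()
    println(s"spark-submit exited with code $exitCode")
  }
}
```

In yarn-cluster mode the application master runs on a cluster node, so the rack-resolution frames in the trace (`RackResolver`, `ScriptBasedMapping`) execute on that node rather than on the machine that calls the launcher.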

Best answer

We hit the same error today, shortly after upgrading Cloudera Manager to 5.10. In our case it was caused by a bug in this version of CM.

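This means that the worker node running the application master (not the edge node running the driver, if you are in yarn-client mode) has no Spark Gateway role and therefore no spark-on-yarn conf directory.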
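
The frames in the trace show Hadoop's `ScriptBasedMapping` handing the script path to `ProcessBuilder.start`, which fails when the file is absent on that node. The same kind of exception can be provoked without Spark or Hadoop. The sketch below is only an illustration: the `MissingScriptDemo` object and the path are hypothetical.

```scala
import java.io.IOException

object MissingScriptDemo {
  // Tries to start the given program; returns the exception message if start() fails.
  def tryStart(program: String): Option[String] =
    try {
      val process = new ProcessBuilder(program).start()
      process.destroy() // the program exists; stop it again, only the start matters here
      None
    } catch {
      case e: IOException => Some(e.getMessage)
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical path that is assumed not to exist on this machine.
    val missing = "/nonexistent/yarn-conf/topology.py"
    println(tryStart(missing).getOrElse("program started"))
  }
}
```

On a Linux host the message should have the same shape as the first line of the trace, ending in `error=2, No such file or directory`.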

Our workaround was to give every node the Spark Gateway role and redeploy the client configuration.
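
One way to confirm that the redeployed configuration has reached a node is to check that the topology script exists there and is executable. This is a minimal sketch, not part of the original answer: the `TopologyScriptCheck` object is hypothetical, and the default path is the one from the error message, whose directory name is likely to differ on another cluster.

```scala
import java.nio.file.{Files, Paths}

object TopologyScriptCheck {
  // True when the path is a regular file that the current user may execute.
  def isUsable(scriptPath: String): Boolean = {
    val path = Paths.get(scriptPath)
    Files.isRegularFile(path) && Files.isExecutable(path)
  }

  def main(args: Array[String]): Unit = {
    // Defaults to the path from the error message; pass another path as the first argument.
    val scriptPath = args.headOption.getOrElse(
      "/etc/spark/conf.cloudera.CD-SPARK_ON_YARN-brkvSOzr/yarn-conf/topology.py")
    println(s"$scriptPath usable: ${isUsable(scriptPath)}")
  }
}
```

Running it on each worker node as the user that owns the YARN containers gives the most relevant answer, since that is the user that will try to execute the script.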

Incidentally, your job should still run, but with reduced or no data locality, so it may be much slower.

Regarding java - Scala, SparkLauncher: Cannot run program "/etc/spark/conf.cloudera.CD-SPARK_ON_YARN-brkvSOzr/yarn-conf/topology.py", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42368447/
