
python - spark-submit continues to hang after job completion

Reposted. Author: 太空狗. Updated: 2023-10-29 21:11:56

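I am trying to test Spark 1.6 with HDFS on AWS. I am using the wordcount Python example available in the examples folder. I submit the job with spark-submit; the job completes successfully and prints the results on the console. The web UI also shows it as completed. However, spark-submit never terminates. I have verified that the context is stopped in the wordcount example code as well.

What could be wrong?

Here is what I see on the console.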

2016-05-24 14:58:04,749 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
2016-05-24 14:58:04,749 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/stages/json,null}
2016-05-24 14:58:04,749 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/stages,null}
2016-05-24 14:58:04,749 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
2016-05-24 14:58:04,750 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
2016-05-24 14:58:04,750 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
2016-05-24 14:58:04,750 INFO [Thread-3] handler.ContextHandler (ContextHandler.java:doStop(843)) - stopped o.s.j.s.ServletContextHandler{/jobs,null}
2016-05-24 14:58:04,802 INFO [Thread-3] ui.SparkUI (Logging.scala:logInfo(58)) - Stopped Spark web UI at http://172.30.2.239:4040
2016-05-24 14:58:04,805 INFO [Thread-3] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(58)) - Shutting down all executors
2016-05-24 14:58:04,805 INFO [dispatcher-event-loop-2] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(58)) - Asking each executor to shut down
2016-05-24 14:58:04,814 INFO [dispatcher-event-loop-5] spark.MapOutputTrackerMasterEndpoint (Logging.scala:logInfo(58)) - MapOutputTrackerMasterEndpoint stopped!
2016-05-24 14:58:04,818 INFO [Thread-3] storage.MemoryStore (Logging.scala:logInfo(58)) - MemoryStore cleared
2016-05-24 14:58:04,818 INFO [Thread-3] storage.BlockManager (Logging.scala:logInfo(58)) - BlockManager stopped
2016-05-24 14:58:04,820 INFO [Thread-3] storage.BlockManagerMaster (Logging.scala:logInfo(58)) - BlockManagerMaster stopped
2016-05-24 14:58:04,821 INFO [dispatcher-event-loop-3] scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint (Logging.scala:logInfo(58)) - OutputCommitCoordinator stopped!
2016-05-24 14:58:04,824 INFO [Thread-3] spark.SparkContext (Logging.scala:logInfo(58)) - Successfully stopped SparkContext
2016-05-24 14:58:04,827 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-2] remote.RemoteActorRefProvider$RemotingTerminator (Slf4jLogger.scala:apply$mcV$sp(74)) - Shutting down remote daemon.
2016-05-24 14:58:04,828 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-2] remote.RemoteActorRefProvider$RemotingTerminator (Slf4jLogger.scala:apply$mcV$sp(74)) - Remote daemon shut down; proceeding with flushing remote transports.
2016-05-24 14:58:04,843 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-2] remote.RemoteActorRefProvider$RemotingTerminator (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting shut down.

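I have to press Ctrl-C to terminate the spark-submit process. This is a really strange problem and I have no idea how to fix it. Please let me know if there are any logs I should be looking at, or anything I should be doing differently here.

Here is the pastebin link to the jstack output of the spark-submit process: http://pastebin.com/Nfnt4XmT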

Best answer

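I had the same problem with a custom thread pool in my Spark job code. I found that spark-submit hangs when your code uses custom non-daemon thread pools: a non-daemon thread that is still alive keeps the process from exiting even after the SparkContext has stopped. You can look at ThreadUtils.newDaemonCachedThreadPool() to see how the Spark developers create thread pools, or you can use that utility yourself, but be careful because it is package-private.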
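Since the question uses a Python driver, here is a minimal sketch of the same rule on the Python side; it is not taken from Spark or from the original answer. A Python process, like a JVM, does not exit while a non-daemon thread is still alive. The helper names `live_non_daemon_threads` and `start_worker` are hypothetical and exist only for illustration.

```python
import threading


def live_non_daemon_threads():
    """Return live non-daemon threads other than the main thread.

    Any thread in this list keeps the interpreter from exiting.
    (Hypothetical helper, not part of Spark or PySpark.)
    """
    main = threading.main_thread()
    return [
        t for t in threading.enumerate()
        if t is not main and not t.daemon and t.is_alive()
    ]


def start_worker(stop_event, daemon):
    """Start a background thread that idles until stop_event is set.

    This stands in for a worker belonging to a custom thread pool.
    """
    t = threading.Thread(
        target=stop_event.wait, name="custom-worker", daemon=daemon
    )
    t.start()
    return t


if __name__ == "__main__":
    stop = threading.Event()
    worker = start_worker(stop, daemon=False)

    # While the worker is alive, the process cannot exit on its own.
    print([t.name for t in live_non_daemon_threads()])

    # Shut the worker down explicitly so the process can terminate.
    stop.set()
    worker.join()
    print([t.name for t in live_non_daemon_threads()])
```

Calling something like `live_non_daemon_threads()` at the end of a driver script, after the SparkContext is stopped, is one way to check whether your own Python code is what keeps the process alive. Threads on the JVM side do not show up here; the jstack output is the place to look for those.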

Regarding "python - spark-submit continues to hang after job completion", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37421852/
