
python - spark-on-k8s resource staging server with Python

Reposted · Author: 行者123 · Updated: 2023-12-02 11:49:02

I have been working through the Running Spark on Kubernetes docs using spark-on-k8s v2.2.0-kubernetes-0.5.0, Kubernetes v1.9.0, and Minikube v0.25.0.

I can run a Python job successfully with the following command:

bin/spark-submit \
--deploy-mode cluster \
--master k8s://https://10.128.0.4:8443 \
--kubernetes-namespace default \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
--jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
local:///opt/spark/examples/src/main/python/pi.py 10

After setting up the resource staging server, I can run a Java job with local dependencies successfully with the following command:
bin/spark-submit \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--master k8s://https://10.128.0.4:8443 \
--kubernetes-namespace default \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.resourceStagingServer.uri=http://10.128.0.4:31000 \
./examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
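
For context, the staging server in the apache-spark-on-k8s distribution is deployed from a manifest bundled with the release; the exact file name, service name, and port below are assumptions from my setup and may differ between releases:

```shell
# Deploy the resource staging server using the manifest shipped with the
# spark-on-k8s distribution (file and service names may vary by release).
kubectl create -f conf/kubernetes-resource-staging-server.yaml

# Look up the NodePort the staging service was exposed on (31000 in my
# case, which is what spark.kubernetes.resourceStagingServer.uri points at).
kubectl get svc spark-resource-staging-service \
  -o jsonpath='{.spec.ports[0].nodePort}'
```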

Is it possible to run a Python job with local dependencies? I tried this command, but it failed:
bin/spark-submit \
--deploy-mode cluster \
--master k8s://https://10.128.0.4:8443 \
--kubernetes-namespace default \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.resourceStagingServer.uri=http://10.128.0.4:31000 \
./examples/src/main/python/pi.py 10

I get this error in the driver's logs:
Error: Could not find or load main class .opt.spark.jars.RoaringBitmap-0.5.11.jar

And these errors in the event log:
MountVolume.SetUp failed for volume "spark-init-properties" : configmaps "spark-pi-1518224354203-init-config" not found
...
MountVolume.SetUp failed for volume "spark-init-secret" : secrets "spark-pi-1518224354203-init-secret" not found

Best answer

A workaround is to supply the examples jar as a dependency via --jars:

bin/spark-submit \
--deploy-mode cluster \
--master k8s://https://10.128.0.4:8443 \
--kubernetes-namespace default \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.resourceStagingServer.uri=http://10.128.0.4:31000 \
--jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
./examples/src/main/python/pi.py 10

I'm not sure why this works (RoaringBitmap-0.5.11.jar should already be present in /opt/spark/jars and added to the classpath regardless), but it resolves the problem for now.
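
One way to sanity-check that claim (a hedged diagnostic, assuming Docker is available locally and the image provides `ls`) is to list the jars directory of the driver image directly:

```shell
# List /opt/spark/jars in the Python driver image and look for RoaringBitmap.
# If it shows up, the jar itself is not missing from the image, and the
# "Could not find or load main class" error lies elsewhere.
docker run --rm --entrypoint ls \
  kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 /opt/spark/jars \
  | grep -i roaring
```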

Regarding "python - spark-on-k8s resource staging server with Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48716630/
