
pyspark - Databricks PySpark job keeps getting cancelled

Reposted. Author: 行者123. Updated: 2023-12-01 04:31:10

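I'm using Databricks notebooks on Azure, and I had a PySpark notebook that ran perfectly well all day yesterday. At the end of the day, though, I noticed some strange errors on code that I knew had worked before: org.apache.spark.SparkException: Job aborted due to stage failure: Task from application
Since it was late, I left it until today. Today I tried creating a new cluster and running the code, and this time it keeps saying my job was "cancelled".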

In fact, I tried running just one line of code:

filePath = "/SalesData.csv"

Even that was cancelled.

Edit:

Here is the stderr log from Azure:
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
/databricks/python/lib/python3.5/site-packages/IPython/config/loader.py:38: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.
from IPython.utils.traitlets import HasTraits, List, Any, TraitError
Fri Jan 4 16:51:08 2019 py4j imported
Fri Jan 4 16:51:08 2019 Python shell started with PID 2543 and guid 86405138b8744987a1df085e4454bb5d
Could not launch process The 'config' trait of an IPythonShell instance must be a Config, but a value of class 'IPython.config.loader.Config' (i.e. {'HistoryManager': {'hist_file': ':memory:'}, 'HistoryAccessor': {'hist_file': ':memory:'}}) was specified. Traceback (most recent call last):
File "/tmp/1546620668035-0/PythonShell.py", line 1048, in <module>
launch_process()
File "/tmp/1546620668035-0/PythonShell.py", line 1036, in launch_process
console_buffer, error_buffer)
File "/tmp/1546620668035-0/PythonShell.py", line 508, in __init__
self.shell = self.create_shell()
File "/tmp/1546620668035-0/PythonShell.py", line 617, in create_shell
ip_shell = IPythonShell.instance(config=config, user_ns=user_ns)
File "/databricks/python/lib/python3.5/site-packages/traitlets/config/configurable.py", line 412, in instance
inst = cls(*args, **kwargs)
File "/databricks/python/lib/python3.5/site-packages/IPython/terminal/embed.py", line 159, in __init__
super(InteractiveShellEmbed,self).__init__(**kw)
File "/databricks/python/lib/python3.5/site-packages/IPython/terminal/interactiveshell.py", line 455, in __init__
super(TerminalInteractiveShell, self).__init__(*args, **kwargs)
File "/databricks/python/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 622, in __init__
super(InteractiveShell, self).__init__(**kwargs)
File "/databricks/python/lib/python3.5/site-packages/traitlets/config/configurable.py", line 84, in __init__
self.config = config
File "/databricks/python/lib/python3.5/site-packages/traitlets/traitlets.py", line 583, in __set__
self.set(obj, value)
File "/databricks/python/lib/python3.5/site-packages/traitlets/traitlets.py", line 557, in set
new_value = self._validate(obj, value)
File "/databricks/python/lib/python3.5/site-packages/traitlets/traitlets.py", line 589, in _validate
value = self.validate(obj, value)
File "/databricks/python/lib/python3.5/site-packages/traitlets/traitlets.py", line 1681, in validate
self.error(obj, value)
File "/databricks/python/lib/python3.5/site-packages/traitlets/traitlets.py", line 1528, in error
raise TraitError(e)
traitlets.traitlets.TraitError: The 'config' trait of an IPythonShell instance must be a Config, but a value of class 'IPython.config.loader.Config' (i.e. {'HistoryManager': {'hist_file': ':memory:'}, 'HistoryAccessor': {'hist_file': ':memory:'}}) was specified.

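Best answer

My team and I ran into this problem after installing the azureml['notebooks'] Python package on our cluster. The installation appeared to succeed, but we got a "cancelled" message whenever we tried to run a code cell.

We also saw an error in the logs similar to the one in this post: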

The 'config' trait of an IPythonShell instance must be a Config, 
but a value of class 'IPython.config.loader.Config'...

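It seems that some Python packages may conflict with this Config object or be incompatible with it. We uninstalled the library, restarted the cluster, and everything worked again. Hope this helps someone :)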
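
To illustrate what kind of conflict that error message points at, here is a minimal sketch in plain Python. It is not the real traitlets or IPython code: `TraitError`, `make_config_class` and `validate_config` are hypothetical stand-ins written for this example, and the module names are only labels copied from the log. The idea it shows is that two classes can both be called Config and still be unrelated types, so a type check that expects one of them rejects the other. That is one plausible reading of the log above, assuming the installed package pulled in a different copy of the class.

```python
class TraitError(Exception):
    """Hypothetical stand-in for the error raised when validation fails."""


def make_config_class(module_name):
    # Build a dict subclass named "Config" that reports the given module name.
    # The module names used below are illustrative labels, not real imports.
    return type("Config", (dict,), {"__module__": module_name})


# Two unrelated classes that happen to share the short name "Config".
ExpectedConfig = make_config_class("traitlets.config.loader")
LegacyConfig = make_config_class("IPython.config.loader")


def validate_config(value, expected=ExpectedConfig):
    # Accept the value only if it is an instance of the expected class.
    if isinstance(value, expected):
        return value
    cls = type(value)
    raise TraitError(
        "must be a {}, but a value of class '{}.{}' was specified".format(
            expected.__name__, cls.__module__, cls.__qualname__
        )
    )


settings = {"HistoryManager": {"hist_file": ":memory:"}}

# Same contents, expected class: accepted.
accepted = validate_config(ExpectedConfig(settings))

# Same contents, look-alike class from another module: rejected.
try:
    validate_config(LegacyConfig(settings))
    message = None
except TraitError as err:
    message = str(err)
    print(message)
```

Both objects hold identical settings, so the data itself is not the problem; only the class identity differs. That would be consistent with the fix described above, where removing the library restored a consistent set of packages.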

Regarding "pyspark - Databricks PySpark job keeps getting cancelled", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54021634/
