
python - PySpark - UnpicklingError : NEWOBJ class argument has NULL tp_new

Reposted. Author: 行者123. Updated: 2023-11-28 17:15:25

When I run the code below, I get an unpickling error:

rdd = sc.parallelize([('HOMICIDE', {'2017': 1}),
                      ('DECEPTIVE PRACTICE', {'2015': 2, '2017': 2, '2016': 8}),
                      ('ROBBERY', {'2016': 2})])

rdd.flatMapValues(dict.items).collect()

The error is shown below. Is there something wrong with using flatMapValues on dictionary values?

  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/worker.py", line 98, in main
    command = pickleSer._read_with_length(infile)
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 164, in _read_with_length
    return self.loads(obj)
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 422, in loads
    return pickle.loads(obj)
UnpicklingError: NEWOBJ class argument has NULL tp_new
) [duplicate 3]
17/06/06 17:01:14 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool

Best answer

rdd = sc.parallelize([('HOMICIDE', {'2017': 1}),
                      ('DECEPTIVE PRACTICE', {'2015': 2, '2017': 2, '2016': 8}),
                      ('ROBBERY', {'2016': 2})])

rdd.flatMapValues(lambda data: data.items()).collect()

[('HOMICIDE', ('2017', 1)),
('DECEPTIVE PRACTICE', ('2015', 2)),
('DECEPTIVE PRACTICE', ('2017', 2)),
('DECEPTIVE PRACTICE', ('2016', 8)),
('ROBBERY', ('2016', 2))]

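dict.items is a method descriptor, not an ordinary Python function, and as the traceback shows, the worker fails while unpickling the function that was shipped to it. You must supply a function that tells flatMapValues how to unpack the values. I did this by passing a lambda function to flatMapValues.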
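To see the difference without a Spark cluster, here is a minimal plain-Python sketch. It assumes Python 3.7 or later, and `flat_map_values` is a hypothetical local stand-in that only imitates what flatMapValues does to each (key, value) pair; it is not part of PySpark and does no pickling.

```python
import types

# dict.items is a C-level method descriptor, not a Python function.
descriptor_type = type(dict.items).__name__
print(descriptor_type)

# The lambda from the answer is an ordinary Python function object.
unpack = lambda data: data.items()
print(isinstance(unpack, types.FunctionType))


def flat_map_values(pairs, func):
    """Hypothetical local imitation of flatMapValues: apply func to each
    value and emit one (key, item) pair per item it yields."""
    return [(key, item) for key, value in pairs for item in func(value)]


records = [('HOMICIDE', {'2017': 1}),
           ('DECEPTIVE PRACTICE', {'2015': 2, '2017': 2, '2016': 8}),
           ('ROBBERY', {'2016': 2})]

result = flat_map_values(records, unpack)
for row in result:
    print(row)
```

A named function defined with def, used in place of the lambda, is also a plain function object and would behave the same way in this sketch.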

Regarding python - PySpark - UnpicklingError : NEWOBJ class argument has NULL tp_new, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44395974/
