
json - Difficulty converting JSON to a Spark DataFrame

Reposted. Author: 行者123. Updated: 2023-12-04 07:52:08

I have been trying to load JSON into a PySpark DataFrame, but I am struggling a bit.
Here is what I have tried so far (with and without multiline):

import json
newJson = json.dumps(testjson)
newdf = spark.read.json(sc.parallelize([newJson]))
newdf.display()
The JSON data:
testjson = [
('{"id":434, "address" : ["432.432.432.432", "432.432.432.432", "432.432.432.432", "432.432.432.432"]}',),
('{"id":434, "address" : ["432.432.432.432", "432.432.432.432", "432.432.432.432", "432.432.432.432"]}',),
('{"id":434, "address" : ["432.432.432.432", "432.432.432.432", "432.432.432.432", "432.432.432.432"]}',),
('{"id":434, "address" : ["432.432.432.432", "432.432.432.432", "432.432.432.432", "432.432.432.432"]}',),
('{"id":434, "address" : ["432.432.432.432", "432.432.432.432", "432.432.432.432", "432.432.432.432"]}',),
('{"id":434, "address" : ["432.432.432.432", "432.432.432.432", "432.432.432.432", "432.432.432.432"]}',),
]
When I try to display the DataFrame, I get "corrupt_record". What am I doing wrong?

Best answer

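Try converting it to a list of strings. Spark cannot interpret a list of tuples of strings: each record handed to spark.read.json must itself be a JSON document string. json.dumps is also unnecessary, because each string in testjson is already valid JSON that Spark should be able to parse directly.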

df = spark.read.json(sc.parallelize([i[0] for i in testjson]))

df.show(truncate=False)
+--------------------------------------------------------------------+---+
|address |id |
+--------------------------------------------------------------------+---+
|[432.432.432.432, 432.432.432.432, 432.432.432.432, 432.432.432.432]|434|
|[432.432.432.432, 432.432.432.432, 432.432.432.432, 432.432.432.432]|434|
|[432.432.432.432, 432.432.432.432, 432.432.432.432, 432.432.432.432]|434|
|[432.432.432.432, 432.432.432.432, 432.432.432.432, 432.432.432.432]|434|
|[432.432.432.432, 432.432.432.432, 432.432.432.432, 432.432.432.432]|434|
|[432.432.432.432, 432.432.432.432, 432.432.432.432, 432.432.432.432]|434|
+--------------------------------------------------------------------+---+
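
To see why the original attempt likely ends in "corrupt_record", it can help to inspect what json.dumps(testjson) actually produces, without Spark at all. The sketch below uses a shortened, two-row copy of testjson (an assumption made for brevity) and only the standard json module. It suggests that json.dumps serializes the list of tuples into a single array of arrays of strings, so no record is a JSON object, whereas taking the first element of each tuple yields one JSON object string per record.

```python
import json

# Shortened stand-in for the question's testjson (assumption: two rows, two addresses),
# keeping the same shape: a list of 1-tuples, each holding one JSON document as a string.
testjson = [
    ('{"id":434, "address" : ["432.432.432.432", "432.432.432.432"]}',),
    ('{"id":434, "address" : ["432.432.432.432", "432.432.432.432"]}',),
]

# What the question hands to Spark: ONE string for the whole list.
dumped = json.dumps(testjson)
reparsed = json.loads(dumped)
# Tuples become JSON arrays, and each inner element is still a plain string,
# not a JSON object with "id" and "address" fields.
inner_types = {type(row[0]).__name__ for row in reparsed}

# What the answer hands to Spark: one JSON object string per record.
records = [row[0] for row in testjson]
parsed = [json.loads(r) for r in records]

print(type(reparsed).__name__, inner_types)
print(parsed[0]["id"], parsed[0]["address"])
```

The records list here is the same expression the answer passes to sc.parallelize, so checking it with json.loads first is a cheap way to confirm that every element is a standalone JSON object before involving Spark.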

Regarding "json - Difficulty converting JSON to a Spark DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66905474/
