
python - How do I read a large JSON file in pandas?

Reposted. Author: 太空狗. Updated: 2023-10-29 21:33:25

My code is: data_review = pd.read_json('review.json')
My review data looks like this:

{
// string, 22 character unique review id
"review_id": "zdSx_SD6obEhz9VrW9uAWA",

// string, 22 character unique user id, maps to the user in user.json
"user_id": "Ha3iJu77CxlrFm-vQRs_8g",

// string, 22 character business id, maps to business in business.json
"business_id": "tnhfDv5Il8EaGSXZGiuQGg",

// integer, star rating
"stars": 4,

// string, date formatted YYYY-MM-DD
"date": "2016-03-09",

// string, the review itself
"text": "Great place to hang out after work: the prices are decent, and the ambience is fun. It's a bit loud, but very lively. The staff is friendly, and the food is good. They have a good selection of drinks.",

// integer, number of useful votes received
"useful": 0,

// integer, number of funny votes received
"funny": 0,

// integer, number of cool votes received
"cool": 0
}

But I get the following error:

    333             fh, handles = _get_handle(filepath_or_buffer, 'r',
    334                                       encoding=encoding)
--> 335             json = fh.read()
    336             fh.close()
    337         else:

OSError: [Errno 22] Invalid argument

My JSON file does not contain any comments, and it is 3.8 GB! I simply downloaded the file from here to practice with: link

The same error is thrown when I use the code below:

import json

with open('review.json') as json_file:
    data = json.load(json_file)

Best answer

If you don't want to use a for loop, the following should do the trick:

import pandas as pd

df = pd.read_json("foo.json", lines=True)

This handles a JSON file that looks similar to this, with one JSON object per line:

{"foo": "bar"}
{"foo": "baz"}
{"foo": "qux"}

The result is a DataFrame with a single column, foo, and three rows.
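To make that behavior concrete, here is a minimal sketch that parses the three-line sample from an in-memory buffer, both with lines=True and with the explicit for loop the answer alludes to. The names sample, records, and df_loop are only illustrative, and the sketch assumes the real file likewise holds one JSON object per line.

```python
import io
import json

import pandas as pd

# The three-line sample from the answer: one JSON object per line.
sample = '{"foo": "bar"}\n{"foo": "baz"}\n{"foo": "qux"}\n'

# Option 1: let pandas parse the line-delimited input directly.
df = pd.read_json(io.StringIO(sample), lines=True)

# Option 2: the explicit for loop, decoding one line at a time.
records = []
for line in io.StringIO(sample):
    line = line.strip()
    if line:  # skip blank lines
        records.append(json.loads(line))
df_loop = pd.DataFrame(records)

print(df.shape)       # (3, 1)
print(df_loop.shape)  # (3, 1)
```

Both routes give one row per input line; the loop version simply leaves room to filter or transform each record before it reaches the DataFrame.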

You can read more in the pandas docs.
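Since the file in the question is 3.8 GB, loading it in one call may still need more memory than is available. A possible variation, sketched here under the assumption that the file is line-delimited, is to pass chunksize together with lines=True so that pandas yields one DataFrame per batch of lines. The helper name read_json_lines_in_chunks, the chunk size, and the stars filter are hypothetical choices for illustration, and a small temporary file stands in for review.json.

```python
import json
import os
import tempfile

import pandas as pd


def read_json_lines_in_chunks(path, chunksize=100_000):
    """Yield DataFrames of at most `chunksize` rows from a line-delimited JSON file."""
    # With lines=True and a chunksize, read_json returns an iterator of DataFrames.
    for chunk in pd.read_json(path, lines=True, chunksize=chunksize):
        yield chunk


# Small stand-in for review.json (hypothetical data): five records, stars 1..5.
rows = [{"review_id": f"r{i}", "stars": i + 1} for i in range(5)]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    for row in rows:
        tmp.write(json.dumps(row) + "\n")
    path = tmp.name

chunk_sizes = []
kept = []
for chunk in read_json_lines_in_chunks(path, chunksize=2):
    chunk_sizes.append(len(chunk))
    # Keep only the rows of interest so the full file never sits in memory at once.
    kept.append(chunk[chunk["stars"] >= 4])
os.remove(path)

result = pd.concat(kept, ignore_index=True)
print(chunk_sizes)  # [2, 2, 1]
print(len(result))  # 2
```

Filtering or aggregating inside the loop is what keeps memory use bounded; concatenating every chunk unfiltered would rebuild the whole dataset in memory.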

Regarding "python - How do I read a large JSON file in pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46790390/
