
python - How do I extract data from a list of dicts into a pandas DataFrame?

Reposted. Author: 行者123. Updated: 2023-12-04 08:54:38

This is part of the JSON file I got as output after running a Python script that uses the Telethon API.

[{"_": "Message", "id": 4589, "to_id": {"_": "PeerChannel", "channel_id": 1399858792}, "date": "2020-09-03T14:51:03+00:00", "message": "Looking for product managers / engineers who have worked in search engine / query understanding space. Please PM me if you can connect me to someone for the same", "out": false, "mentioned": false, "media_unread": false, "silent": false, "post": false, "from_scheduled": false, "legacy": false, "edit_hide": false, "from_id": 356886523, "fwd_from": null, "via_bot_id": null, "reply_to_msg_id": null, "media": null, "reply_markup": null, "entities": [], "views": null, "edit_date": null, "post_author": null, "grouped_id": null, "restriction_reason": []}, {"_": "MessageService", "id": 4588, "to_id": {"_": "PeerChannel", "channel_id": 1399858792}, "date": "2020-09-03T11:48:18+00:00", "action": {"_": "MessageActionChatJoinedByLink", "inviter_id": 310378430}, "out": false, "mentioned": false, "media_unread": false, "silent": false, "post": false, "legacy": false, "from_id": 1264437394, "reply_to_msg_id": null}

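As you can see, the Python script scraped the chat history of a particular Telegram channel. All I need is to store the date and message fields of the JSON in a separate DataFrame, so that I can apply filters and produce the output I want. Can anyone help me with this?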

Best answer

  • This assumes the object returned from the API is not a string (e.g. '[{...}, {...}]').
  • If it is a string, call data = json.loads(data) first.
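
A minimal sketch of that string check, assuming the API output may arrive either as a JSON string or as an already-parsed list. The helper name to_records and the shortened sample string are hypothetical and used only for illustration:

```python
import json

def to_records(data):
    """Return a list of dicts, parsing `data` first if it is a JSON string."""
    if isinstance(data, (str, bytes)):
        # json.loads turns the JSON text into Python lists and dicts
        data = json.loads(data)
    return data

# Shortened, hypothetical sample in the same shape as the output above
raw = (
    '[{"_": "Message", "id": 4589, "date": "2020-09-03T14:51:03+00:00", "message": "hello"},'
    ' {"_": "MessageService", "id": 4588, "date": "2020-09-03T11:48:18+00:00"}]'
)
records = to_records(raw)
```

Passing an already-parsed list through to_records returns it unchanged, so the same call works for both cases.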

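  • The 'date' and the corresponding 'message' can be extracted from the list of dicts with a list comprehension.
  • Iterate over each dict in the list and use dict.get(key). If the key does not exist, None is returned.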

  • import pandas as pd

    # where data is the list of dicts, unpack the desired keys and load into pandas
    df = pd.DataFrame([{'date': i.get('date'), 'message': i.get('message')} for i in data])

    # display(df)
                            date  message
    0  2020-09-03T14:51:03+00:00  Looking for product managers / engineers who have worked in search engine / query understanding space. Please PM me if you can connect me to someone for the same
    1  2020-09-03T11:48:18+00:00  None
    Or
  • If you want to skip the records where 'message' is None:

  • df = pd.DataFrame([{'date': i['date'], 'message': i['message']} for i in data if i.get('message')])

                         date  message
    2020-09-03T14:51:03+00:00  Looking for product managers / engineers who have worked in search engine / query understanding space. Please PM me if you can connect me to someone for the same
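
The question mentions applying filters afterwards. One possible way to do that, sketched here under assumptions, is to parse the 'date' strings with pd.to_datetime and filter with boolean masks. The cutoff time and the shortened sample records are hypothetical:

```python
import pandas as pd

# Shortened, hypothetical sample in the same shape as the JSON above
data = [
    {"date": "2020-09-03T14:51:03+00:00", "message": "Looking for product managers"},
    {"date": "2020-09-03T11:48:18+00:00"},  # service message, no 'message' key
]

df = pd.DataFrame([{'date': i.get('date'), 'message': i.get('message')} for i in data])

# Convert the ISO 8601 strings to timezone-aware timestamps
df['date'] = pd.to_datetime(df['date'], utc=True)

# Hypothetical filter: messages with text, sent at or after noon UTC
cutoff = pd.Timestamp('2020-09-03T12:00:00', tz='UTC')
recent = df[(df['date'] >= cutoff) & df['message'].notna()]
```

Converting the column first means comparisons are done on real timestamps rather than on strings.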

    Regarding "python - How do I extract data from a list of dicts into a pandas DataFrame?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63908851/
