
python - GroupBy frequency counts on a JSON response - nested field

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:18:04

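I am trying to aggregate the response from an API call that returns JSON objects and get some frequency counts.

I have managed to do this for one field in the JSON response, but the same operation does not work for a second field I want to count.

Both fields are called "category", but the one that does not work is nested inside "outcome_status".

The error I get is KeyError: 'category'

The code below uses a public API that requires no authentication, so it is easy to test.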

import simplejson
import requests

#make a polygon for use in the API call
lat_coord = 51.767538
long_coord = -1.497488
lat_upper = str(lat_coord + 0.02)
lat_lower = str(lat_coord - 0.02)
long_upper = str(long_coord + 0.02)
long_lower = str(long_coord - 0.02)

#call from the API - no authentication required
api_call="https://data.police.uk/api/crimes-street/all-crime?poly=" + lat_lower + "," + long_upper + ":" + lat_lower + "," + long_lower + ":" + lat_upper + "," + long_lower + ":" + lat_upper + "," + long_upper + "&date=2017-01"
print (api_call)

request_resp=requests.get(api_call).json()

import pandas as pd
import numpy as np

df_resp = pd.DataFrame(request_resp)

#frequency counts for non-nested field (this works)
df_resp.groupby('category').context.count()

#next bit tries to do the nested (this doesn't work)

#tried dropping nulls
df_outcome = df_resp['outcome_status'].dropna()
print(df_outcome)

#tried index reset
df_outcome.reset_index()

#just errors
df_outcome.groupby('category').date.count()
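
The failing last line can be reproduced without calling the API. The sketch below is only an illustration: the Series `s` and its values are made up to stand in for `df_outcome`, on the assumption that each element of the "outcome_status" column is a plain dict. A Series has no columns, so `groupby('category')` has nothing named "category" to group on.

```python
import pandas as pd

# Hypothetical stand-in for df_resp['outcome_status'].dropna():
# a Series whose elements are dicts, not a DataFrame with columns.
s = pd.Series([
    {"category": "Local resolution", "date": "2017-02"},
    {"category": "Local resolution", "date": "2017-03"},
])

# Each element is a whole dict; "category" is a key inside it.
print(type(s.iloc[0]))

try:
    # The string is looked up as a label or index level name on the
    # Series, is not found, and pandas raises KeyError.
    s.groupby("category")
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)
```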

Best answer

I think you will have the easiest time if you expand the dicts in the "outcome_status" column:

Code:

outcome_status = [
    {'outcome_status_' + k: v for k, v in z.items()} for z in (
        dict(category=None, date=None) if x is None else x
        for x in (y['outcome_status'] for y in request_resp)
    )
]
df = pd.concat([df_resp.drop('outcome_status', axis=1),
                pd.DataFrame(outcome_status)], axis=1)

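This uses some comprehensions to rename the fields in outcome_status by prefixing each key name with "outcome_status_" and turning them into columns. It also expands None values into a dict whose category and date are both None, so rows without an outcome still line up with the rest of the frame.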
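
To see what the comprehensions do without hitting the API, here is a small offline sketch. The records in `sample_resp` are made up (only the two keys that matter are included), so treat the values as hypothetical; the flattening code itself is the same as above.

```python
import pandas as pd

# Hypothetical records shaped like the API response (made-up values).
sample_resp = [
    {"category": "burglary",
     "outcome_status": {"category": "Unable to prosecute suspect", "date": "2017-03"}},
    {"category": "burglary", "outcome_status": None},
    {"category": "robbery",
     "outcome_status": {"category": "Unable to prosecute suspect", "date": "2017-02"}},
    {"category": "robbery",
     "outcome_status": {"category": "Local resolution", "date": "2017-02"}},
]
df_resp = pd.DataFrame(sample_resp)

# Replace a missing outcome with a dict of Nones, then prefix every key.
outcome_status = [
    {"outcome_status_" + k: v for k, v in z.items()}
    for z in (
        dict(category=None, date=None) if x is None else x
        for x in (y["outcome_status"] for y in sample_resp)
    )
]

# Swap the dict column for the new flat columns.
df = pd.concat(
    [df_resp.drop("outcome_status", axis=1), pd.DataFrame(outcome_status)],
    axis=1,
)

# The nested field is now an ordinary column, so groupby can find it.
# Rows whose outcome is missing are left out of the groups by default.
counts = df.groupby("outcome_status_category").category.count()
print(counts)
```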

Test code:

import requests
import pandas as pd

# make a polygon for use in the API call
lat_coord = 51.767538
long_coord = -1.497488
lat_upper = str(lat_coord + 0.02)
lat_lower = str(lat_coord - 0.02)
long_upper = str(long_coord + 0.02)
long_lower = str(long_coord - 0.02)

# call from the API - no authentication required
api_call = ("https://data.police.uk/api/crimes-street/all-crime?poly=" +
            lat_lower + "," + long_upper + ":" +
            lat_lower + "," + long_lower + ":" +
            lat_upper + "," + long_lower + ":" +
            lat_upper + "," + long_upper + "&date=2017-01")

request_resp = requests.get(api_call).json()
df_resp = pd.DataFrame(request_resp)

outcome_status = [
    {'outcome_status_' + k: v for k, v in z.items()} for z in (
        dict(category=None, date=None) if x is None else x
        for x in (y['outcome_status'] for y in request_resp)
    )
]
df = pd.concat([df_resp.drop('outcome_status', axis=1),
                pd.DataFrame(outcome_status)], axis=1)

# frequency counts for the nested field
print(df.groupby('outcome_status_category').category.count())

Result:

outcome_status_category
Court result unavailable                          4
Investigation complete; no suspect identified    38
Local resolution                                  1
Offender given a caution                          2
Offender given community sentence                 3
Offender given conditional discharge              1
Offender given penalty notice                     2
Status update unavailable                         6
Suspect charged as part of another case           1
Unable to prosecute suspect                       9
Name: category, dtype: int64
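
As an aside, and not part of the answer above: pandas also has a `json_normalize` function that flattens nested dicts into dotted column names. The sketch below assumes pandas 1.0 or later (where it is available as `pd.json_normalize`) and uses made-up records in `sample_resp`, so it illustrates the idea rather than replacing the tested code.

```python
import pandas as pd

# Hypothetical records shaped like the API response (made-up values).
sample_resp = [
    {"category": "burglary",
     "outcome_status": {"category": "Unable to prosecute suspect", "date": "2017-03"}},
    {"category": "burglary", "outcome_status": None},
    {"category": "robbery",
     "outcome_status": {"category": "Unable to prosecute suspect", "date": "2017-02"}},
]

# Nested dicts become columns such as "outcome_status.category".
# A record whose outcome_status is None gets NaN in those columns.
flat = pd.json_normalize(sample_resp)

# value_counts skips NaN by default, so missing outcomes are not counted.
counts = flat["outcome_status.category"].value_counts()
print(counts)
```

Note that the column separator here is a dot rather than the underscore used in the answer, so the column to count is "outcome_status.category".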

This covers python - GroupBy frequency counts on a JSON response - nested field. The original question is on Stack Overflow: https://stackoverflow.com/questions/48127096/
