python - groupby on a shared substring in a pandas DataFrame

Repost. Author: 行者123. Updated: 2023-11-30 22:31:13

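I have been using pandas to export JSON data to a CSV file. Now I have been asked to group this data and get the totals for each date, grouped by system. Below is a sample of my DataFrame.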

DataFrame:

system,totalCapacity,totalLocatedCapacity,availableCapacity,date
aadata02,96155472,99924183,39116616,20170728
aadata02,41943174,41614541,15946266,20170728
aadata03,52764600,50966017,13839882,20170728
aadata03,52764600,44043720,15503376,20170728
aadata03,37373700,35654440,7073598,20170728
...
bbdata01,38473680,25168248,24006696,20170728
bbdata01,17585400,14681478,11711826,20170728
bbdata01,22015224,6907992,20668746,20170728

I started with the following code:

import pandas as pd

csvin = "test.csv"
csvout = "test2.csv"

df = pd.read_csv(csvin)
col = ['site', 'totalPoolCapacity', 'totalLocatedCapacity',
       'availableVolumeCapacity', 'date']
df = df.groupby('date', as_index=False).sum()
df['site'] = pd.Series('aa', index=df.index)
with open(csvout, 'w+') as f:
    df.to_csv(f, index=False, header=True, columns=col)

This only sums everything by date and puts it all under site aa. How can I modify my code so that it produces output like this:

site,totalCapacity,totalLocatedCapacity,availableCapacity,date
aa,903240114,735713005,421348788,20170728
bb,78074304,46757718,56387268,20170728

Best answer

Is this what you are looking for?

df.groupby([df.system.str[:2], 'date']).sum().reset_index()

   system      date  totalCapacity  totalLocatedCapacity  availableCapacity
0      aa  20170728      281001546             272202901           91479738
1      bb  20170728       78074304              46757718           56387268
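
The one-liner above groups on the first two characters of `system` together with `date`. As a minimal, self-contained sketch of the same idea, the snippet below loads only the sample rows shown in the question (the rows elided by `...` are not included, so the `aa` totals match the answer's output rather than the question's expected figures). The `site` helper column and the `value_cols` list are names introduced here for illustration, not part of the original answer. Selecting the numeric columns explicitly is a precaution: depending on the pandas version, calling `sum()` on the whole frame may also try to aggregate the string `system` column.

```python
import io

import pandas as pd

# Sample rows copied from the question; the rows hidden behind "..." are omitted.
CSV_TEXT = """system,totalCapacity,totalLocatedCapacity,availableCapacity,date
aadata02,96155472,99924183,39116616,20170728
aadata02,41943174,41614541,15946266,20170728
aadata03,52764600,50966017,13839882,20170728
aadata03,52764600,44043720,15503376,20170728
aadata03,37373700,35654440,7073598,20170728
bbdata01,38473680,25168248,24006696,20170728
bbdata01,17585400,14681478,11711826,20170728
bbdata01,22015224,6907992,20668746,20170728
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Illustrative helper column: assume the first two characters of `system`
# identify the site, as in the answer's df.system.str[:2].
df["site"] = df["system"].str[:2]

# Columns to total; chosen explicitly so the string `system` column is left out.
value_cols = ["totalCapacity", "totalLocatedCapacity", "availableCapacity"]

# One row per (site, date) pair, with the grouping keys kept as ordinary columns.
result = df.groupby(["site", "date"], as_index=False)[value_cols].sum()

# Reorder to the layout requested in the question: site first, date last.
result = result[["site"] + value_cols + ["date"]]

# to_csv returns the CSV text when no path is given; pass a filename to write a file.
csv_out = result.to_csv(index=False)
print(csv_out)
```

To write the file from the question instead of printing, `result.to_csv("test2.csv", index=False)` should be enough; the `with open(...)` wrapper is not required.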

Regarding python - groupby on a shared substring in a pandas DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45871191/