python - Pandas: How to sum the 3 largest values of a sub-level in a multi-level DataFrame

Reposted. Author: 太空宇宙. Updated: 2023-11-04 11:16:14

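I have the DataFrame shown below. It is sorted so that "POP" is in descending order within each "STATE". Now I want to sum the three largest "POP" values for each "STATE". How should I do that?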

import pandas as pd
d = [['X','q',123383],['X','w',43857349],['X','e',236657],['X','r',23574594],
     ['Y','t',547853],['Y','y',46282134],['Y','u',43857439],['Y','i',32654893],['Y','i',95678312]]
df = pd.DataFrame(d, columns = ['STATE','COUNTY','POP'])
# Sort POP in descending order within each STATE, then index by (STATE, COUNTY)
sorted_df = df.sort_values(['STATE','POP'], ascending=[True, False]).set_index(['STATE','COUNTY'])

print(sorted_df)

# sorted_df:
                   POP
STATE COUNTY
X     w       43857349
      r       23574594
      e         236657
      q         123383
Y     i       95678312
      y       46282134
      u       43857439
      i       32654893
      t         547853

Best answer

nlargest does not need the data to be sorted beforehand, so you can apply it to the original df:

df.groupby(['STATE']).POP.nlargest(3)

This gives:

STATE
X      1    43857349
       3    23574594
       2      236657
Y      8    95678312
       5    46282134
       6    43857439
Name: POP, dtype: int64
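
To illustrate the point about pre-sorting, here is a minimal sketch that compares nlargest on the unsorted frame with the sort-then-take-first-3 route. The variable names top3_unsorted, top3_sorted and same_values are made up for this example and are not part of the original answer.

```python
import pandas as pd

d = [['X', 'q', 123383], ['X', 'w', 43857349], ['X', 'e', 236657], ['X', 'r', 23574594],
     ['Y', 't', 547853], ['Y', 'y', 46282134], ['Y', 'u', 43857439], ['Y', 'i', 32654893],
     ['Y', 'i', 95678312]]
df = pd.DataFrame(d, columns=['STATE', 'COUNTY', 'POP'])

# Top 3 POP values per state, taken straight from the unsorted frame.
top3_unsorted = df.groupby('STATE')['POP'].nlargest(3)

# The long way: sort first, then keep the first 3 rows of each state.
top3_sorted = (df.sort_values(['STATE', 'POP'], ascending=[True, False])
                 .groupby('STATE')['POP'].head(3))

# Compare the selected values (the two results carry different indexes).
same_values = top3_unsorted.tolist() == top3_sorted.tolist()
print(same_values)
```

Only the values are compared here because nlargest returns a (STATE, row label) MultiIndex, while head(3) keeps the original row labels.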

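If you only care about the sum:

# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
df.groupby(['STATE']).POP.nlargest(3).groupby(level=0).sum()

This gives: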

STATE
X     67668600
Y    185817885
Name: POP, dtype: int64
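
If you are starting from the multi-level sorted_df in the question rather than the flat df, one possible variant is to group on the outer index level directly. This is a sketch under that assumption, not the original answer's code; top3_sum is a name chosen for illustration.

```python
import pandas as pd

d = [['X', 'q', 123383], ['X', 'w', 43857349], ['X', 'e', 236657], ['X', 'r', 23574594],
     ['Y', 't', 547853], ['Y', 'y', 46282134], ['Y', 'u', 43857439], ['Y', 'i', 32654893],
     ['Y', 'i', 95678312]]
df = pd.DataFrame(d, columns=['STATE', 'COUNTY', 'POP'])

# Rebuild the (STATE, COUNTY)-indexed frame from the question.
sorted_df = (df.sort_values(['STATE', 'POP'], ascending=[True, False])
               .set_index(['STATE', 'COUNTY']))

# Group on the outer index level and sum the 3 largest POP values of each group.
top3_sum = (sorted_df.groupby(level='STATE')['POP']
                     .apply(lambda s: s.nlargest(3).sum()))
print(top3_sum)
```

The lambda returns one scalar per state, so the result is a Series indexed by STATE only, which avoids the extra index level that a plain nlargest call would add.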

For "python - Pandas: How to sum the 3 largest values of a sub-level in a multi-level DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56943216/
