
python - List the most common members in Pandas groups?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:07:03

I have a DataFrame with columns like this:

id        lead_sponsor                            lead_sponsor_class
02837692  Janssen Research & Development, LLC     Industry
02837679  Aarhus University Hospital              Other
02837666  Universidad Autonoma de Ciudad Juarez   Other
02837653  Universidad Autonoma de Madrid          Other
02837640  Beirut Eye Specialist Hospital          Other

I want to find the most common lead sponsors. I can list the size of each group with:

df.groupby(['lead_sponsor', 'lead_sponsor_class']).size()

which gives me this:

lead_sponsor                 lead_sponsor_class
307 Hospital of PLA          Other                 1
3E Therapeutics Corporation  Industry              1
3M                           Industry              4
4SC AG                       Industry              8
5 Santé                      Other                 1

But how do I find the 10 most common groups? If I do this:

df.groupby(['lead_sponsor', 'lead_sponsor_class']).size().sort_values(ascending=False).head(10) 

then I get an error:

AttributeError: 'Series' object has no attribute 'sort_values'

Best answer

I think you can use Series.nlargest:

print (df.groupby(['lead_sponsor', 'lead_sponsor_class']).size().nlargest(10))

Note from the docs:

Faster than .sort_values(ascending=False).head(n) for small n relative to the size of the Series object.
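
As a rough check of that note, the sketch below builds a synthetic Series (the data, the seed, and the variable names are made up for illustration and are not from the question) and confirms that both approaches select the same top counts. The labels of tied groups may come back in a different order, so only the values are compared.

```python
import numpy as np
import pandas as pd

# Synthetic sponsor labels; purely illustrative, not the data from the question.
rng = np.random.default_rng(0)
sponsors = pd.Series(rng.integers(0, 1000, size=100_000)).astype(str)

# Group sizes, analogous to df.groupby([...]).size() in the answer.
sizes = sponsors.groupby(sponsors).size()

top_sorted = sizes.sort_values(ascending=False).head(10)
top_nlargest = sizes.nlargest(10)

# The selected counts agree; tied groups may be ordered differently.
print(top_sorted.to_numpy())
print(top_nlargest.to_numpy())
```

Wrapping each expression in a timer such as timeit would be one way to see the speed difference on your own data; no timings are claimed here.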

Example:

import pandas as pd

df = pd.DataFrame({'id': {0: 2837692, 1: 2837679, 2: 2837666, 3: 2837653, 4: 2837640},
'lead_sponsor': {0: 'a', 1: 'a', 2: 'a', 3: 's', 4: 's'},
'lead_sponsor_class': {0: 'Industry', 1: 'Other', 2: 'Other', 3: 'Other', 4: 'Other'}})

print (df)
        id lead_sponsor lead_sponsor_class
0  2837692            a           Industry
1  2837679            a              Other
2  2837666            a              Other
3  2837653            s              Other
4  2837640            s              Other

print (df.groupby(['lead_sponsor', 'lead_sponsor_class']).size())
lead_sponsor  lead_sponsor_class
a             Industry              1
              Other                 2
s             Other                 2
dtype: int64

print (df.groupby(['lead_sponsor', 'lead_sponsor_class']).size().sort_values(ascending=False).head(2))
lead_sponsor  lead_sponsor_class
s             Other                 2
a             Other                 2
dtype: int64

print (df.groupby(['lead_sponsor', 'lead_sponsor_class']).size().nlargest(2))
lead_sponsor  lead_sponsor_class
a             Other                 2
s             Other                 2
dtype: int64
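
The AttributeError in the question is most likely a version issue: Series.sort_values was added in pandas 0.17.0, so older installations do not have it, while nlargest evidently worked for the asker. A minimal sketch, assuming a hypothetical helper named top_groups (not part of pandas), that picks whichever method is available:

```python
import pandas as pd


def top_groups(frame, keys, n=10):
    """Return the n largest group sizes (hypothetical helper for illustration)."""
    sizes = frame.groupby(keys).size()
    if hasattr(sizes, "sort_values"):
        # Available from pandas 0.17.0 onwards.
        return sizes.sort_values(ascending=False).head(n)
    # Older pandas: fall back to nlargest, as suggested in the answer.
    return sizes.nlargest(n)


df = pd.DataFrame({
    "lead_sponsor": ["a", "a", "a", "s", "s"],
    "lead_sponsor_class": ["Industry", "Other", "Other", "Other", "Other"],
})

print(pd.__version__)  # shows which branch applies on this installation
result = top_groups(df, ["lead_sponsor", "lead_sponsor_class"], n=2)
print(result)
```

Upgrading pandas is the simpler fix where that is possible; the fallback only matters if you are stuck on an old release.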

Regarding "python - List the most common members in Pandas groups?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39141080/
