python - Pandas : Group by and sort by total size

Reposted. Author: 行者123. Updated: 2023-12-01 04:18:15

Suppose I have this result:

group1 = df.groupby(['first_column', 'second_column'], as_index=False).size()

first_column  second_column
A             A1               1
              A2               2
B             B1               1
              B2               2
              B3               3

I would then like to compute the total size for each first_column value and display it like this:

first_column  second_column
A             A1               1    3
              A2               2
B             B1               1    6
              B2               2
              B3               3

I then want to sort by total size and keep the top 10 groups. How can I do this? Is it also possible to name the columns, like this?

first_column    second_column   size    total_size

Update 1

The DataFrame looks like this.

df.head()

  first_column second_column
0            A            A1
1            A            A2
2            A            A2
3            B            B1
4            B            B2
5            B            B2
6            B            B3
7            B            B3
8            B            B3

Best answer

The code comments should be self-explanatory.

import pandas as pd

# Sample data.
df = pd.DataFrame({
    'first_column': ['A']*3 + ['B']*6,
    'second_column': ['A1'] + ['A2']*2 + ['B1'] + ['B2']*2 + ['B3']*3,
})

# Count rows per (first_column, second_column) pair, then flatten the result
# into a DataFrame whose count column is named 'size'.
gb = df.groupby(['first_column', 'second_column']).size().reset_index(name='size')

>>> gb
  first_column second_column  size
0            A            A1     1
1            A            A2     2
2            B            B1     1
3            B            B2     2
4            B            B3     3

# Use transform to sum the `size` by the first column only.
gb['total_size'] = gb.groupby('first_column')['size'].transform('sum')

>>> gb
  first_column second_column  size  total_size
0            A            A1     1           3
1            A            A2     2           3
2            B            B1     1           6
3            B            B2     2           6
4            B            B3     3           6
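
The question also asks for the result sorted by total size and limited to the top 10, which the steps above do not cover. One possible way to do that is sketched below. It assumes "top 10" means the ten first_column values with the largest totals; the names top_n, top_groups and result are illustrative and do not come from the original answer.

```python
import pandas as pd

# Same sample data as in the answer.
df = pd.DataFrame({
    'first_column': ['A'] * 3 + ['B'] * 6,
    'second_column': ['A1'] + ['A2'] * 2 + ['B1'] + ['B2'] * 2 + ['B3'] * 3,
})

# Per-pair counts plus the per-first_column total, as in the answer.
gb = df.groupby(['first_column', 'second_column']).size().reset_index(name='size')
gb['total_size'] = gb.groupby('first_column')['size'].transform('sum')

top_n = 10  # assumption: number of first_column groups to keep

# first_column values with the largest totals (all of them if fewer than top_n exist).
top_groups = gb.groupby('first_column')['size'].sum().nlargest(top_n).index

# Keep only those groups, largest total first; rows of a group stay together
# and are ordered by second_column.
result = (
    gb[gb['first_column'].isin(top_groups)]
    .sort_values(['total_size', 'first_column', 'second_column'],
                 ascending=[False, True, True])
    .reset_index(drop=True)
)
print(result)
```

With the sample data there are only two groups, so both are kept and the rows for B (total 6) come before the rows for A (total 3). If several groups tie at the cutoff, nlargest keeps the ones that appear first.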

Regarding python - Pandas : Group by and sort by total size, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34057808/
