
python - Get the row values with the maximum count after applying group by in Pandas

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:39:20

I have the following df:

In [260]: df
Out[260]:
    size market vegetable  confirm availability
0  Large    ABC    Tomato                   NaN
1  Large    XYZ    Tomato                   NaN
2  Small    ABC    Tomato                   NaN
3  Large    ABC     Onion                   NaN
4  Small    ABC     Onion                   NaN
5  Small    XYZ     Onion                   NaN
6  Small    XYZ     Onion                   NaN
7  Small    XYZ   Cabbage                   NaN
8  Large    XYZ   Cabbage                   NaN
9  Small    ABC   Cabbage                   NaN
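
For anyone who wants to reproduce the frame above, here is a minimal sketch that rebuilds it by hand. It assumes the last column is literally named "confirm availability" (with a space), as the printed output suggests, and that it holds only NaN.

```python
import numpy as np
import pandas as pd

# Rebuild the question's example frame row by row.
# Assumption: the fourth column is named "confirm availability" and is all NaN.
df = pd.DataFrame({
    "size": ["Large", "Large", "Small", "Large", "Small",
             "Small", "Small", "Small", "Large", "Small"],
    "market": ["ABC", "XYZ", "ABC", "ABC", "ABC",
               "XYZ", "XYZ", "XYZ", "XYZ", "ABC"],
    "vegetable": ["Tomato"] * 3 + ["Onion"] * 4 + ["Cabbage"] * 3,
    "confirm availability": np.nan,  # scalar is broadcast to every row
})

print(df)
```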

1) How do I get, for each vegetable, the size that occurs most often?

I used groupby on vegetable and size to get the following df, but what I need is, for each vegetable, the row whose size has the maximum count.

In [262]: df.groupby(['vegetable','size']).count()
Out[262]:
                 market  confirm availability
vegetable size
Cabbage   Large       1                     0
          Small       2                     0
Onion     Large       1                     0
          Small       3                     0
Tomato    Large       2                     0
          Small       1                     0

df2['vegetable','size'] = df.groupby(['vegetable','size']).count().apply( some logic )

Required df:

  vegetable   size  max_count
0   Cabbage  Small          2
1     Onion  Small          3
2    Tomato  Large          2

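2) Now I can say that df contains mostly "Small Cabbage". So I need to fill the confirm availability column with Small for every Cabbage row (and likewise for the other vegetables). How can I do this?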

    size market vegetable confirm availability
0  Large    ABC    Tomato                Large
1  Large    XYZ    Tomato                Large
2  Small    ABC    Tomato                Large
3  Large    ABC     Onion                Small
4  Small    ABC     Onion                Small
5  Small    XYZ     Onion                Small
6  Small    XYZ     Onion                Small
7  Small    XYZ   Cabbage                Small
8  Large    XYZ   Cabbage                Small
9  Small    ABC   Cabbage                Small

Best answer

1)

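# veg_df is the question's df. Count rows per (vegetable, size); the count lands in the 'market' column.
# Sorting by that count and keeping the last row per vegetable leaves the most frequent size.
required_df = veg_df.groupby(['vegetable','size'], as_index=False)['market'].count()\
    .sort_values(by=['vegetable', 'market'])\
    .drop_duplicates(subset='vegetable', keep='last')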
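
The answer's result keeps the count under the name market rather than max_count. As a hedged alternative (not the answerer's method), the sketch below uses size() plus idxmax() to produce the exact layout the question asks for. The frame is rebuilt inline so the block runs on its own; when two sizes tie, idxmax() keeps whichever comes first in the grouped order.

```python
import pandas as pd

# Same rows as the question's df (only the columns needed here).
veg_df = pd.DataFrame({
    "size": ["Large", "Large", "Small", "Large", "Small",
             "Small", "Small", "Small", "Large", "Small"],
    "market": ["ABC", "XYZ", "ABC", "ABC", "ABC",
               "XYZ", "XYZ", "XYZ", "XYZ", "ABC"],
    "vegetable": ["Tomato"] * 3 + ["Onion"] * 4 + ["Cabbage"] * 3,
})

# Number of rows per (vegetable, size) pair, as a flat frame.
counts = (
    veg_df.groupby(["vegetable", "size"])
    .size()
    .reset_index(name="max_count")
)

# For each vegetable, pick the row label with the largest count,
# then select those rows and renumber the index.
best_rows = counts.groupby("vegetable")["max_count"].idxmax()
required_df = counts.loc[best_rows].reset_index(drop=True)

print(required_df)
```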

2)

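# Attach each vegetable's most frequent size (size_y) to every row of veg_df,
# then keep the original columns and rename size_y to confirm_availability.
merged_df = veg_df.merge(required_df, on='vegetable')
cols = ['size_x', 'market_x', 'vegetable', 'size_y']
dict_renaming_cols = {'size_x': 'size',
                      'market_x': 'market',
                      'size_y': 'confirm_availability'}
merged_df = merged_df.loc[:, cols].rename(columns=dict_renaming_cols)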
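
If the merge and rename feel heavy, one possible shortcut is to build a vegetable-to-size lookup and map it onto the rows. This is only a sketch under the assumption that the target column keeps the question's name, "confirm availability"; it is not part of the original answer, and ties between sizes are resolved by whatever value_counts() lists first.

```python
import numpy as np
import pandas as pd

# Same rows as the question's df, including the empty target column.
veg_df = pd.DataFrame({
    "size": ["Large", "Large", "Small", "Large", "Small",
             "Small", "Small", "Small", "Large", "Small"],
    "market": ["ABC", "XYZ", "ABC", "ABC", "ABC",
               "XYZ", "XYZ", "XYZ", "XYZ", "ABC"],
    "vegetable": ["Tomato"] * 3 + ["Onion"] * 4 + ["Cabbage"] * 3,
    "confirm availability": np.nan,
})

# Series indexed by vegetable whose value is that vegetable's most frequent size.
top_size = veg_df.groupby("vegetable")["size"].agg(
    lambda s: s.value_counts().idxmax()
)

# Look up each row's vegetable in that Series and overwrite the target column.
veg_df["confirm availability"] = veg_df["vegetable"].map(top_size)

print(veg_df)
```

Compared with the merge, this leaves veg_df's columns and row order untouched and needs no suffix cleanup.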

Regarding python - Get the row values with the maximum count after applying group by in Pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52243060/
