
pandas - Group, split, and pick the top rows in a dataframe

Reposted. Author: 行者123. Updated: 2023-12-04 08:34:29

Question
Given the following dataframe df:

import random
import pandas as pd
random.seed(999)
sz = 50

qty = {'one': 1, 'two': 2, 'three': 3}

thing = (random.choice(['one', 'two', 'three']) for _ in range(sz))
order = (random.choice(['ascending', 'descending']) for _ in range(sz))
value = (random.randint(0, 100) for _ in range(sz))

df = pd.DataFrame({'thing': thing, 'order': order, 'value': value})
... I would like to:
  • group by thing
  • split by order
  • sort value for each thing according to its order
  • pick the top qty rows for that thing

Expected result:
        thing       order  value
    0     one   ascending     17
    1     one  descending      1
    2     two   ascending     28
    3     two   ascending     30
    4     two  descending     13
    5     two  descending     38
    6   three   ascending      6
    7   three   ascending     27
    8   three   ascending     35
    9   three  descending      4
    10  three  descending      5
    11  three  descending      6
I obtained this result by coding it manually as follows:
    one_a = df[(df.thing == 'one') & (df.order == 'ascending')].reset_index(drop=True).sort_values('value', ascending='True').head(qty['one'])
    one_d = df[(df.thing == 'one') & (df.order == 'descending')].reset_index(drop=True).sort_values('value', ascending='False').head(qty['one'])
    two_a = df[(df.thing == 'two') & (df.order == 'ascending')].reset_index(drop=True).sort_values('value', ascending='True').head(qty['two'])
    two_d = df[(df.thing == 'two') & (df.order == 'descending')].reset_index(drop=True).sort_values('value', ascending='False').head(qty['two'])
    three_a = df[(df.thing == 'three') & (df.order == 'ascending')].reset_index(drop=True).sort_values('value', ascending='True').head(qty['three'])
    three_d = df[(df.thing == 'three') & (df.order == 'descending')].reset_index(drop=True).sort_values('value', ascending='False').head(qty['three'])

    print(pd.concat([one_a, one_d, two_a, two_d, three_a, three_d], ignore_index=True))
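
The six near-identical statements above can be condensed into a loop. The following is a minimal sketch, not the asker's code: `top_rows_loop` and the small `demo` frame are hypothetical names introduced for illustration. It passes real booleans to `ascending`; a non-empty string such as `'False'` is truthy in Python, and recent pandas releases appear to reject string values for this parameter, so with booleans the descending groups return their largest values, which may differ from the table shown earlier.

```python
import pandas as pd


def top_rows_loop(df, qty):
    """Hypothetical helper: filter, sort, and take head(n) per (thing, order) pair."""
    parts = []
    for thing, n in qty.items():
        # Pair each order label with the boolean flag sort_values expects
        for order, asc in (("ascending", True), ("descending", False)):
            subset = df[(df["thing"] == thing) & (df["order"] == order)]
            parts.append(subset.sort_values("value", ascending=asc).head(n))
    return pd.concat(parts, ignore_index=True)


# Small illustrative frame (assumed data, not the question's random sample)
demo = pd.DataFrame({
    "thing": ["one", "one", "one", "two", "two", "two", "two", "two", "two"],
    "order": ["ascending", "descending", "descending",
              "ascending", "ascending", "ascending",
              "descending", "descending", "descending"],
    "value": [5, 1, 9, 7, 3, 10, 2, 8, 4],
})

result = top_rows_loop(demo, {"one": 1, "two": 2})
print(result)
```

The loop still scans the frame once per pair, so it only removes the repetition; it does not answer the question of whether groupby can do the work.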

Is it possible to achieve this using groupby, sort_values, and set_index?

Best answer

One issue is selecting ascending and descending separately. We can work around this by negating the values of the descending rows:

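    # Negate the 'descending' values so a single ascending sort handles both orders
    df.loc[df.order == 'descending', 'value'] *= -1

    # Position of each row within its (thing, order) group after sorting by value
    s = (df.sort_values('value')
           .groupby(['thing', 'order'])
           .cumcount()
           .reindex(df.index)
        )

    # Keep the rows whose position is below the quantity for their thing
    out = df[s < df['thing'].map(qty)].sort_values(['thing', 'order'])
    # Restore the original sign of the 'descending' values
    out.loc[out.order == 'descending', 'value'] *= -1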
Output:
        thing       order  value
    14    one   ascending     17
    27    one  descending      1
    13  three   ascending      6
    17  three   ascending     35
    38  three   ascending     27
    4   three  descending      5
    23  three  descending      4
    37  three  descending      6
    21    two   ascending     28
    42    two   ascending     30
    6     two  descending     38
    9     two  descending     13
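
The answer's approach modifies the value column of df in place. As a hedged variant of the same idea, the sketch below keeps the negated values in a temporary column so the input frame is left untouched. The function name `top_rows_grouped`, the helper column `_key`, and the `demo` data are assumptions made for illustration and do not come from the original answer.

```python
import pandas as pd


def top_rows_grouped(df, qty):
    """Hypothetical helper: top qty[thing] rows per (thing, order) group."""
    # Signed sort key: negate 'descending' values so one ascending sort serves both
    sign = df["order"].map({"ascending": 1, "descending": -1})
    work = df.assign(_key=df["value"] * sign)
    # 0-based position of each row inside its group once sorted by the key
    rank = (work.sort_values("_key")
                .groupby(["thing", "order"])
                .cumcount()
                .reindex(work.index))
    # Keep rows ranked below the quantity for their thing
    keep = rank < work["thing"].map(qty)
    return (work[keep]
            .sort_values(["thing", "order", "_key"])
            .drop(columns="_key"))


# Small illustrative frame (assumed data, not the question's random sample)
demo = pd.DataFrame({
    "thing": ["one", "one", "one", "two", "two", "two", "two", "two", "two"],
    "order": ["ascending", "descending", "descending",
              "ascending", "ascending", "ascending",
              "descending", "descending", "descending"],
    "value": [5, 1, 9, 7, 3, 10, 2, 8, 4],
})

out = top_rows_grouped(demo, {"one": 1, "two": 2})
print(out)
```

Sorting the final frame by the signed key as well lists each group in its own order: smallest first for ascending, largest first for descending. Rows whose thing has no entry in qty are dropped, because the comparison against a missing quantity evaluates to False.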

Regarding "pandas - Group, split, and pick the top rows in a dataframe", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64864630/
