
python - Find the name of the column with the highest value in each pandas dataframe row, including ties

Reposted. Author: 行者123. Updated: 2023-12-05 00:44:41

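I have a dataframe that records the number and type of fruit owned by different people. I want to add a column that indicates each person's top fruit. If a person has two or more top fruits (that is, a tie), I want a list (or tuple) of them.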

Input

For example, say my input is this dataframe:

import pandas as pd

# Create all the fruit data
data = [{'fruit0':'strawberry','fruit0_count':23,'fruit1':'orange','fruit1_count':4,'fruit2':'grape','fruit2_count':27},
        {'fruit0':'apple','fruit0_count':45,'fruit1':'mango','fruit1_count':45,'fruit2':'orange','fruit2_count':12},
        {'fruit0':'blueberry','fruit0_count':30,'fruit1':'grapefruit','fruit1_count':32,'fruit2':'cherry','fruit2_count':94},
        {'fruit0':'pineapple','fruit0_count':4,'fruit1':'grape','fruit1_count':4,'fruit2':'lemon','fruit2_count':67}]

# Add people's names as an index
df = pd.DataFrame(data, index=['Shawn', 'Monica','Jamal','Tracy'])

# Print the dataframe
df

. . . which creates the input dataframe:
        fruit0      fruit0_count  fruit1      fruit1_count  fruit2  fruit2_count
Shawn   strawberry  23            orange      4             grape   27
Monica  apple       45            mango       45            orange  12
Jamal   blueberry   30            grapefruit  32            cherry  94
Tracy   pineapple   4             grape       4             lemon   67

Target output

What I want is a new column that gives the name of each person's top fruit. If the person has two (or more) fruits tied for first place, I want a list or tuple of those fruits:
        fruit0      fruit0_count  fruit1      fruit1_count  fruit2  fruit2_count  top_fruit
Shawn   strawberry  23            orange      4             grape   27            grape
Monica  apple       45            mango       45            orange  12            (apple, mango)
Jamal   blueberry   30            grapefruit  32            cherry  94            cherry
Tracy   pineapple   4             grape       4             lemon   67            lemon

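My attempt so far

The closest I've gotten is based on https://stackoverflow.com/a/38955365/6480859 .

Problems:
  • If there is a tie for the top fruit, it only captures one of them (Monica's top fruit comes out as just apple).
  • It's really complicated. That is not a problem in itself, but if there is a more direct route, I'd like to learn it.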

import numpy as np

# List the columns that contain count numbers
cols = ['fruit0_count', 'fruit1_count', 'fruit2_count']

# Make a new dataframe with just those columns.
only_counts_df = pd.DataFrame()
only_counts_df[cols] = df[cols].copy()

# Indicate how many results you want. Note: If you increase
# this from 1, it gives you the #2, #3, etc. ranking -- it
# doesn't represent tied results.
nlargest = 1

# The next two lines are suggested from
# https://stackoverflow.com/a/38955365/6480859. I don't totally
# follow along . . .
order = np.argsort(-only_counts_df.values, axis=1)[:, :nlargest]
# The column labels are converted to a NumPy array before indexing,
# because recent pandas versions reject 2-D indexing on an Index.
result = pd.DataFrame(only_counts_df.columns.to_numpy()[order],
                      columns=['top{}'.format(i) for i in range(1, nlargest+1)],
                      index=only_counts_df.index)

# Join the results back to our original dataframe
result = df.join(result).copy()

# The dataframe now reports the name of the column that
# contains the top fruit. Convert this to the fruit name.
def id_fruit(row):
    if row['top1'] == 'fruit0_count':
        return row['fruit0']
    elif row['top1'] == 'fruit1_count':
        return row['fruit1']
    elif row['top1'] == 'fruit2_count':
        return row['fruit2']
    else:
        return "Failed"

result['top_fruit'] = result.apply(id_fruit, axis=1)
result = result.drop(['top1'], axis=1).copy()
result

. . . which outputs:
        fruit0      fruit0_count  fruit1      fruit1_count  fruit2  fruit2_count  top_fruit
Shawn   strawberry  23            orange      4             grape   27            grape
Monica  apple       45            mango       45            orange  12            apple
Jamal   blueberry   30            grapefruit  32            cherry  94            cherry
Tracy   pineapple   4             grape       4             lemon   67            lemon

Monica's top fruit should be both apple and mango.
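
To illustrate why the attempt above cannot report ties, here is a minimal sketch using only Monica's three counts from the question. The array name `counts` and the variable `is_top` are illustrative and not part of the original code. Slicing the `argsort` result to one column always leaves a single position per row, whereas comparing against the row maximum flags every tied position.

```python
import numpy as np

# Monica's counts from the question: apple 45, mango 45, orange 12.
counts = np.array([[45, 45, 12]])

# argsort returns a full ordering of positions. Keeping only the first
# column leaves exactly one position per row, so a tie has nowhere to go.
order = np.argsort(-counts, axis=1)[:, :1]
print(order.shape)  # (1, 1)

# Comparing each count with the row maximum marks every tied position:
# apple and mango are True, orange is False.
is_top = counts == counts.max(axis=1, keepdims=True)
print(is_top)
```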

Any tips are welcome, thanks!

Best answer

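The idea is to split each name/count pair of columns into two dataframes, df1 (names) and df2 (counts), then compare the counts with the row-wise max to build a boolean mask, filter the names with DataFrame.where using that mask, and finally collect the non-missing values in apply: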

df1 = df.iloc[:, ::2]
df2 = df.iloc[:, 1::2]
mask = df2.eq(df2.max(axis=1), axis=0)

df['top'] = df1.where(mask.to_numpy()).apply(lambda x: x.dropna().tolist(), axis=1)
print(df)
            fruit0  fruit0_count      fruit1  fruit1_count  fruit2  \
Shawn   strawberry            23      orange             4   grape
Monica       apple            45       mango            45  orange
Jamal    blueberry            30  grapefruit            32  cherry
Tracy    pineapple             4       grape             4   lemon

        fruit2_count             top
Shawn             27         [grape]
Monica            12  [apple, mango]
Jamal             94        [cherry]
Tracy             67         [lemon]
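
The answer returns a list for every row, while the question's target output shows a plain name when there is no tie and a tuple when there is one. The following is a rough sketch of one way to get that shape, not the answerer's code. It assumes every count column is named after its fruit column plus a `_count` suffix, and the helper `collapse` is a hypothetical name introduced here. Columns are selected by name rather than by position, so the pairing does not depend on column order.

```python
import pandas as pd

data = [
    {'fruit0': 'strawberry', 'fruit0_count': 23, 'fruit1': 'orange', 'fruit1_count': 4, 'fruit2': 'grape', 'fruit2_count': 27},
    {'fruit0': 'apple', 'fruit0_count': 45, 'fruit1': 'mango', 'fruit1_count': 45, 'fruit2': 'orange', 'fruit2_count': 12},
    {'fruit0': 'blueberry', 'fruit0_count': 30, 'fruit1': 'grapefruit', 'fruit1_count': 32, 'fruit2': 'cherry', 'fruit2_count': 94},
    {'fruit0': 'pineapple', 'fruit0_count': 4, 'fruit1': 'grape', 'fruit1_count': 4, 'fruit2': 'lemon', 'fruit2_count': 67},
]
df = pd.DataFrame(data, index=['Shawn', 'Monica', 'Jamal', 'Tracy'])

# Assumption: each count column is "<name column>_count".
count_cols = [c for c in df.columns if c.endswith('_count')]
name_cols = [c[:-len('_count')] for c in count_cols]

# True wherever a count equals its row's maximum (ties give several Trues).
counts = df[count_cols]
is_top = counts.eq(counts.max(axis=1), axis=0)


def collapse(names):
    """Hypothetical helper: one name stays a string, ties become a tuple."""
    names = names.dropna().tolist()
    return names[0] if len(names) == 1 else tuple(names)


# to_numpy() drops the column labels so the mask is applied by position;
# the mask's labels (fruit0_count, ...) differ from the name columns'.
df['top_fruit'] = df[name_cols].where(is_top.to_numpy()).apply(collapse, axis=1)
print(df['top_fruit'])
```

Mixing strings and tuples in one column matches the requested output, but it makes later processing less uniform than the answer's all-lists column, so which form is preferable depends on how the column is used afterwards.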

Regarding python - Find the name of the column with the highest value in each pandas dataframe row, including ties, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59501133/
