
python - Display the top 5 largest values of a DataFrame for each month

Reposted. Author: 行者123. Updated: 2023-12-04 09:51:51

I am working with a DataFrame that has many columns (505), and I want to select only the 5 largest values for each month.
A link to an image of my DataFrame is below.

link photo

Here is a sample:

  Dates         1        2       3           4       5     6
2002-07-31 -31.710916 NaN -5.208684 -29.773404 NaN -7.308558
2002-08-31 -44.941351 NaN 3.665286 -23.987135 NaN 3.134669
2002-09-30 -36.725548 NaN 4.114474 -19.536571 NaN -0.986986
2002-10-31 -25.377286 NaN -0.486158 -5.887594 NaN -0.787117
2002-11-30 19.766328 NaN -5.298877 -10.672174 NaN -21.057946
2002-12-31 1.996514 NaN -7.570497 -9.257122 NaN -19.630112
2003-01-31 -0.366083 NaN -14.124492 -5.434475 NaN -8.053424
2003-02-28 -17.869297 NaN -20.075997 1.009837 NaN -11.616974

How can I do this? I have already tried df.max(axis=1), but I want the next 4 largest values in addition to the maximum.
Thanks for your help.

Best answer

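I am assuming you want the 5 largest columns for each row, since that is how I read your question. The code below selects the 2 largest per row for the example input, because it has only 4 non-NaN columns.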

import io
import re
import pandas as pd


# First read in the data you supplied.
data=io.StringIO(re.sub(" +","\t",
"""Dates 1 2 3 4 5 6
2002-07-31 -31.710916 NaN -5.208684 -29.773404 NaN -7.308558
2002-08-31 -44.941351 NaN 3.665286 -23.987135 NaN 3.134669
2002-09-30 -36.725548 NaN 4.114474 -19.536571 NaN -0.986986
2002-10-31 -25.377286 NaN -0.486158 -5.887594 NaN -0.787117
2002-11-30 19.766328 NaN -5.298877 -10.672174 NaN -21.057946
2002-12-31 1.996514 NaN -7.570497 -9.257122 NaN -19.630112
2003-01-31 -0.366083 NaN -14.124492 -5.434475 NaN -8.053424
2003-02-28 -17.869297 NaN -20.075997 1.009837 NaN -11.616974"""))
df = pd.read_csv(data,sep="\t")

# Then reshape the data from wide to long format: one row per (date, column) pair.
df = df.melt(id_vars='Dates',var_name='Column_name',value_name='Value')

# Finally extract the top 2 values for each date. Setting Column_name as the
# index first keeps track of which column each value came from.
print(df.set_index('Column_name').groupby('Dates')['Value'].apply(lambda grp: grp.nlargest(2)))

The output is:
Dates       Column_name
2002-07-31  3              -5.208684
            6              -7.308558
2002-08-31  3               3.665286
            6               3.134669
2002-09-30  3               4.114474
            6              -0.986986
2002-10-31  3              -0.486158
            6              -0.787117
2002-11-30  1              19.766328
            3              -5.298877
2002-12-31  1               1.996514
            3              -7.570497
2003-01-31  1              -0.366083
            4              -5.434475
2003-02-28  4               1.009837
            6             -11.616974
Name: Value, dtype: float64
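
For the full 505-column frame, the same melt-then-group idea can be wrapped in a small helper that takes the number of values to keep as a parameter. The following is only a sketch: the function name top_n_per_date and the two-row sample frame are made up for illustration, and it assumes the date column is called Dates as in the sample above. NaN values are dropped explicitly first, so a date with fewer than n valid values simply returns fewer rows.

```python
import numpy as np
import pandas as pd


def top_n_per_date(wide: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Return up to n largest non-NaN values per date, in long format.

    Hypothetical helper: assumes `wide` has a 'Dates' column plus one
    column per series, like the sample in the question.
    """
    long = wide.melt(id_vars="Dates", var_name="Column_name", value_name="Value")
    # Drop missing values up front so they can never be selected.
    long = long.dropna(subset=["Value"])
    # Largest values first within each date, then keep the first n rows per date.
    long = long.sort_values(["Dates", "Value"], ascending=[True, False])
    return long.groupby("Dates").head(n).reset_index(drop=True)


# Two rows taken from the sample data in the question.
sample = pd.DataFrame(
    {
        "Dates": ["2002-07-31", "2002-11-30"],
        "1": [-31.710916, 19.766328],
        "2": [np.nan, np.nan],
        "3": [-5.208684, -5.298877],
        "4": [-29.773404, -10.672174],
        "5": [np.nan, np.nan],
        "6": [-7.308558, -21.057946],
    }
)

top5 = top_n_per_date(sample, n=5)
print(top5)
```

With this sample, asking for 5 values yields only 4 rows per date, because each row has just 4 non-NaN entries; on the real data with 505 columns the result would normally have 5 rows per date.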

Unless you are more explicit about the output you want, it is hard to give a more fitting answer.
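
One other output shape that might be what is wanted is a wide table with one row per date and the largest values in ranked columns. The sketch below is an assumption about the desired result, not something the question confirms; the function name top_n_wide and the column labels top_1, top_2, ... are invented for illustration. It sorts each row with NumPy, so the originating column names are not kept.

```python
import numpy as np
import pandas as pd


def top_n_wide(wide: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """One row per date; columns top_1..top_n hold the largest values, largest first.

    Hypothetical helper: assumes a 'Dates' column and numeric data columns.
    Rows with fewer than n non-NaN values are padded with NaN on the right.
    """
    values = wide.set_index("Dates")
    arr = values.to_numpy(dtype=float)
    k = min(n, arr.shape[1])
    # np.sort is ascending and places NaN last. Sorting the negated array and
    # negating back therefore gives descending order with NaN still last.
    top = -np.sort(-arr, axis=1)[:, :k]
    columns = [f"top_{i}" for i in range(1, k + 1)]
    return pd.DataFrame(top, index=values.index, columns=columns)


# One row taken from the sample data in the question.
sample = pd.DataFrame(
    {
        "Dates": ["2002-07-31"],
        "1": [-31.710916],
        "2": [np.nan],
        "3": [-5.208684],
        "4": [-29.773404],
        "5": [np.nan],
        "6": [-7.308558],
    }
)

result = top_n_wide(sample, n=5)
print(result)
```

This version avoids reshaping and works on the whole array at once, which may matter with 505 columns. If the column names are needed alongside the values, the long format from the answer is the simpler starting point.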

Regarding "python - Display the top 5 largest values of a DataFrame for each month", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62004150/
