
python - Extracting pandas column names based on a condition

Reposted. Author: 行者123. Updated: 2023-12-03 10:39:42

Starting from a pandas DataFrame df, I computed a ranking, which is shown in rank_df below.

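Now I want to create a new DataFrame, results, with three columns: ["first", "second", "third"]. It should be filled with the corresponding column names of rank_df. For example, the first row of results might contain ['ticker_3', 'ticker_1', 'ticker_4']. In other words, the column first of results should always hold the name of the rank_df column with the highest rank, second the name of the next one, and so on.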

import numpy as np
import pandas as pd

np.random.seed(123)

cols = ["ticker_" + str(i + 1) for i in range(5)]
df = pd.DataFrame(np.random.rand(3, 5), columns=cols)
df

Output:

   ticker_1  ticker_2  ticker_3  ticker_4  ticker_5
0 0.696469 0.286139 0.226851 0.551315 0.719469
1 0.423106 0.980764 0.684830 0.480932 0.392118
2 0.343178 0.729050 0.438572 0.059678 0.398044

Generating rank_df:

rank_df = df.rank(axis=1, method="first", ascending=False)
rank_df

Output:

   ticker_1  ticker_2  ticker_3  ticker_4  ticker_5
0 2.0 4.0 5.0 3.0 1.0
1 4.0 1.0 2.0 3.0 5.0
2 4.0 1.0 2.0 5.0 3.0
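
A side note on method="first": it breaks ties by order of appearance, so the ranks within each row are distinct and "first", "second", "third" are well defined. The sketch below illustrates this with a made-up one-row frame (the name tied and its values are assumptions for illustration, not data from the question):

```python
import pandas as pd

# Hypothetical one-row frame with a tie between columns a and b.
tied = pd.DataFrame([[0.5, 0.5, 0.1]], columns=["a", "b", "c"])

# method="first" gives every column in the row its own rank.
first_ranks = tied.rank(axis=1, method="first", ascending=False)

# method="average" (the default) lets tied columns share a rank.
average_ranks = tied.rank(axis=1, method="average", ascending=False)

print(first_ranks)    # the three ranks in the row are all distinct
print(average_ranks)  # the tied columns share rank 1.5
```

With the default method the two tied columns would both claim rank 1.5, and there would be no single "first" column to report.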

The result I need to generate:

# NaNs in this final DataFrame need to be filled with the respective column names
results = pd.DataFrame(None, index=rank_df.index, columns=["first", "second", "third"])

Best answer

IIUC, you can try argsort (note that the df printed here holds different random values than the one in the question):

print(df)
ticker_1 ticker_2 ticker_3 ticker_4 ticker_5
0 0.548814 0.715189 0.602763 0.544883 0.423655
1 0.645894 0.437587 0.891773 0.963663 0.383442
2 0.791725 0.528895 0.568045 0.925597 0.071036

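# Sort each row in descending order on the raw NumPy values and keep the first 3 column positions
results[:] = df.columns.to_numpy()[np.argsort(-df.to_numpy(), axis=1)[:, :3]]  # change 3 to n as required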
print(results)

      first    second     third
0 ticker_2 ticker_3 ticker_1
1 ticker_4 ticker_3 ticker_1
2 ticker_4 ticker_1 ticker_3
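
For comparison, here is a minimal sketch of the same idea applied to the question's own data. The values are typed in from the df output shown in the question (rounded to six decimals) rather than regenerated from the seed. The helper name top_n_columns and the variable results_from_rank are hypothetical, introduced only for this illustration. The second variant assumes the ranks in each row are distinct, which method="first" provides.

```python
import numpy as np
import pandas as pd

# Values copied from the df printed in the question (rounded to 6 decimals).
cols = ["ticker_" + str(i + 1) for i in range(5)]
df = pd.DataFrame(
    [
        [0.696469, 0.286139, 0.226851, 0.551315, 0.719469],
        [0.423106, 0.980764, 0.684830, 0.480932, 0.392118],
        [0.343178, 0.729050, 0.438572, 0.059678, 0.398044],
    ],
    columns=cols,
)
rank_df = df.rank(axis=1, method="first", ascending=False)
labels = ["first", "second", "third"]


def top_n_columns(frame, n=3):
    """Hypothetical helper: names of the n largest columns per row, best first."""
    # Negate the values so an ascending sort lists the largest value first;
    # a stable sort keeps tied values in column order.
    order = np.argsort(-frame.to_numpy(), axis=1, kind="stable")[:, :n]
    # Indexing the 1-D column array with a 2-D integer array yields a 2-D array of names.
    return frame.columns.to_numpy()[order]


# Variant 1: work from the raw values in df.
results = pd.DataFrame(top_n_columns(df, 3), index=df.index, columns=labels)

# Variant 2: work from rank_df. Each row of ranks is a permutation of 1..5,
# so sorting the ranks in ascending order lists the columns from rank 1 upward.
rank_order = np.argsort(rank_df.to_numpy(), axis=1)[:, :3]
results_from_rank = pd.DataFrame(
    rank_df.columns.to_numpy()[rank_order], index=rank_df.index, columns=labels
)

print(results)
```

Building results with the DataFrame constructor sidesteps the need to pre-allocate an empty frame and assign into it. Both variants give the same three names per row for this data, for example ticker_5, ticker_1, ticker_4 in row 0, which matches ranks 1.0, 2.0 and 3.0 in the rank_df output of the question.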

Regarding "python - Extracting pandas column names based on a condition", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61562029/
