
python - Pandas DataFrame conditional replacement and column trimming

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:20:55

Current Pandas DataFrame

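import numpy as np
import pandas as pd

# Use np.nan rather than the string 'NaN' so the columns stay numeric
# and comparisons such as `fn1[col] > 0` work.
fn1 = pd.DataFrame(
    [
        ['A', np.nan, np.nan, 9, 6],
        ['B', np.nan, 2, np.nan, 7],
        ['C', 3, 2, np.nan, 10],
        ['D', np.nan, 7, np.nan, np.nan],
        ['E', np.nan, np.nan, 3, 3],
        ['F', np.nan, np.nan, 7, np.nan],
    ],
    columns=['Symbol', 'Condition1', 'Condition2', 'Condition3', 'Condition4'],
)

fn1.set_index('Symbol', inplace=True)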



        Condition1  Condition2  Condition3  Condition4
Symbol
A              NaN         NaN           9           6
B              NaN           2         NaN           7
C                3           2         NaN          10
D              NaN           7         NaN         NaN
E              NaN         NaN           3           3
F              NaN         NaN           7         NaN

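I'm currently working with a Pandas DataFrame that looks like the one shown above. Column by column, I'm trying to replace every non-NaN value with the "Symbol" associated with that row, and then collapse each column (or write to a new DataFrame) so that each column becomes a list of the Symbols for which that Condition is present, as shown in the desired output: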

Desired Output

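I have been able to collect the Symbols for which each condition is present into a list of lists (see below), but I would like to keep the same column names. I also can't add the lists to a growing new DataFrame as I loop over the columns, because their lengths vary.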

ls2 = []
for col in fn1.columns:
    fn2 = fn1[fn1[col] > 0]
    ls2.append(list(fn2.index))

Here fn1 is the DataFrame shown first above, with the "Symbol" column set as the index.

Thanks in advance for your help.

Best answer

Another approach is to use slicing, as shown below (explanations are in the comments):

import numpy as np
import pandas as pd

df = pd.DataFrame.from_dict(
    {
        "Symbol": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"],
        "Condition1": [1, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan, 8, 12],
        "Condition2": [np.nan, 2, 2, 7, np.nan, np.nan, 5, 11, 14, np.nan, np.nan],
    }
)


new_df = pd.concat(
    [
        # Keep the Symbols of the rows where this column is not null,
        # and discard the old index (as your output suggests)
        df["Symbol"][df[column].notnull()].reset_index(drop=True)
        for column in list(df)[1:]  # Iterate over all columns except "Symbol"
    ],
    axis=1,  # Column-wise concatenation
)
# Rename columns
new_df.columns = list(df)[1:]
# You can leave NaNs or replace them with empty string, your choice
new_df.fillna("", inplace=True)

The output of this operation is:

  Condition1 Condition2
0          a          b
1          c          c
2          g          d
3          j          g
4          k          h
5                     i
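The same idea can be applied to the frame from the question, where "Symbol" is the index rather than a column. The following is only a minimal sketch, not part of the original answer: it assumes the missing entries are real np.nan values (not the string 'NaN'), and the name result is made up for illustration. Each column's surviving index labels are wrapped in a Series, so pandas pads the shorter columns itself when the dict is turned into a DataFrame.

```python
import numpy as np
import pandas as pd

# The question's frame, assuming real missing values (np.nan).
fn1 = pd.DataFrame(
    [
        ["A", np.nan, np.nan, 9, 6],
        ["B", np.nan, 2, np.nan, 7],
        ["C", 3, 2, np.nan, 10],
        ["D", np.nan, 7, np.nan, np.nan],
        ["E", np.nan, np.nan, 3, 3],
        ["F", np.nan, np.nan, 7, np.nan],
    ],
    columns=["Symbol", "Condition1", "Condition2", "Condition3", "Condition4"],
).set_index("Symbol")

# For each condition, keep the index labels (Symbols) of the non-null rows.
# Building the frame from a dict of Series keeps the column names and lets
# pandas align the variable-length columns, padding with NaN.
result = pd.DataFrame(
    {col: pd.Series(fn1[col].dropna().index.tolist()) for col in fn1.columns}
)

# Optional: show empty strings instead of NaN in the padded cells.
result = result.fillna("")
print(result)
```

Because the dict keys become the column names, this avoids both the separate renaming step and the problem of appending lists of different lengths inside a loop.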

If you need any further clarification, please leave a comment below.

Regarding "python - Pandas DataFrame conditional replacement and column trimming", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54261116/
