
python - How to melt a DataFrame in Pandas with the option to drop NA values

Reposted. Author: 行者123. Updated: 2023-11-28 22:29:34

I have a Pandas DataFrame like this:

import pandas as pd

df = pd.DataFrame({"VAR1": ["V1", "V2", "V2", "V3", "V4", "V4", "V5"],
                   "VAR2": ["C1", "C1", "C1", "C2", "C2", "C2", "C3"],
                   "VAR3": ["S1", "S2", "S3", "S4", "", "", ""],
                   "VAR4": ["", "S3", "S4", "S5", "S6", "", ""],
                   "VAR5": ["", "S7", "", "", "", "", "S3"]})

df

I need to convert it into a DataFrame that looks like this:

VAR1  VAR2  VALUE
V1 C1 S1
V2 C1 S2
V2 C1 S3
V2 C1 S7
V3 C2 S4
V3 C2 S5
V4 C2 S6
V5 C3 S3

That is, I want to melt the columns VAR3, VAR4 and VAR5 into a single column, based on their mapping to VAR1 and VAR2.

Best answer

You can use melt, then boolean indexing to remove the rows with empty values, then sort_values on the id columns, and finally reset_index to get a default, monotonic, unique index:

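# Melt every column except VAR1/VAR2 into VALUE, then drop the helper 'variable' column
df = pd.melt(df, id_vars=['VAR1', 'VAR2'], value_name='VALUE').drop('variable', axis=1)
# Keep only non-empty values, sort by the id columns and rebuild the index
df = df[df.VALUE != ''].sort_values(['VAR1','VAR2']).reset_index(drop=True)
print(df)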
VAR1 VAR2 VALUE
0 V1 C1 S1
1 V2 C1 S2
2 V2 C1 S3
3 V2 C1 S3
4 V2 C1 S4
5 V2 C1 S7
6 V3 C2 S4
7 V3 C2 S5
8 V4 C2 S6
9 V5 C3 S3
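
The question title mentions NA values, while the sample data uses empty strings. If the empty cells are meant to be treated as missing, the boolean filter can be swapped for dropna. The following is only a sketch under that assumption; the name `melted` and the conversion of `''` to NaN are illustrative choices, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"VAR1": ["V1", "V2", "V2", "V3", "V4", "V4", "V5"],
                   "VAR2": ["C1", "C1", "C1", "C2", "C2", "C2", "C3"],
                   "VAR3": ["S1", "S2", "S3", "S4", "", "", ""],
                   "VAR4": ["", "S3", "S4", "S5", "S6", "", ""],
                   "VAR5": ["", "S7", "", "", "", "", "S3"]})

melted = (
    df.replace('', np.nan)                 # assumption: '' stands for a missing value
      .melt(id_vars=['VAR1', 'VAR2'], value_name='VALUE')
      .drop(columns='variable')            # the source column name is not needed
      .dropna(subset=['VALUE'])            # drop rows whose VALUE is missing
      .sort_values(['VAR1', 'VAR2'])
      .reset_index(drop=True)
)
print(melted)
```

This version leaves the original df untouched, so later steps can still start from the unmelted data.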

If needed, you can also use drop_duplicates:

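# Start again from the original df defined in the question
df = pd.melt(df, id_vars=['VAR1', 'VAR2'], value_name='VALUE').drop('variable', axis=1)
# Same as before, but identical (VAR1, VAR2, VALUE) rows are collapsed into one
df = df[df.VALUE != ''].drop_duplicates().sort_values(['VAR1','VAR2']).reset_index(drop=True)
print(df)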
VAR1 VAR2 VALUE
0 V1 C1 S1
1 V2 C1 S2
2 V2 C1 S3
3 V2 C1 S4
4 V2 C1 S7
5 V3 C2 S4
6 V3 C2 S5
7 V4 C2 S6
8 V5 C3 S3
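
In the outputs above, the values inside each VAR1/VAR2 group appear in ascending order only because of the order in which melt stacks the columns; the sort keys do not include VALUE. If that order matters, one option is to add VALUE to the sort keys. The helper below is a hypothetical wrapper (the name `melt_nonempty` and its `dedupe` parameter are made up for illustration) that combines the steps of the answer without overwriting the input:

```python
import pandas as pd

def melt_nonempty(frame, id_vars, value_name='VALUE', dedupe=False):
    """Hypothetical helper: melt, drop empty strings, optionally drop duplicates."""
    out = frame.melt(id_vars=id_vars, value_name=value_name).drop(columns='variable')
    out = out[out[value_name] != '']       # boolean indexing, as in the answer
    if dedupe:
        out = out.drop_duplicates()
    # Sorting on the value column too makes the row order explicit
    return out.sort_values(list(id_vars) + [value_name]).reset_index(drop=True)

df = pd.DataFrame({"VAR1": ["V1", "V2", "V2", "V3", "V4", "V4", "V5"],
                   "VAR2": ["C1", "C1", "C1", "C2", "C2", "C2", "C3"],
                   "VAR3": ["S1", "S2", "S3", "S4", "", "", ""],
                   "VAR4": ["", "S3", "S4", "S5", "S6", "", ""],
                   "VAR5": ["", "S7", "", "", "", "", "S3"]})

all_rows = melt_nonempty(df, ['VAR1', 'VAR2'])
unique_rows = melt_nonempty(df, ['VAR1', 'VAR2'], dedupe=True)
print(unique_rows)
```

Because the function returns a new frame, both variants can be computed from the same df.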

Regarding python - How to melt a DataFrame in Pandas with the option to drop NA values, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42933696/
