
python - Pandas: collapsing rows in a MultiIndex DataFrame

Reposted. Author: 行者123. Updated: 2023-12-04 01:13:14

Here is my df:

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

print(df)

   A  B  C         D       E      F  measure_m
0  1  2  3  Cancer A  Ecog 9  val 6        100
1  1  2  3  Cancer B  Ecog 1  val 1        200
2  1  2  3  Cancer A  Ecog 0  val 0        500
3  2  3  4  Cancer B  Ecog 1  val 1        300

When I pivot this df without passing an index, I get this:

In [1280]: df.pivot(index=None, columns=['A', 'B', 'C', 'D', 'E', 'F'])
Out[1280]:
  measure_m
A         1                             2
B         2                             3
C         3                             4
D  Cancer A  Cancer B  Cancer A  Cancer B
E    Ecog 9    Ecog 1    Ecog 0    Ecog 1
F     val 6     val 1     val 0     val 1
0     100.0       NaN       NaN       NaN
1       NaN     200.0       NaN       NaN
2       NaN       NaN     500.0       NaN
3       NaN       NaN       NaN     300.0

Instead of 4 rows, I want a single row that contains all the values of the measure_m column, like this:

  measure_m
A         1                             2
B         2                             3
C         3                             4
D  Cancer A  Cancer B  Cancer A  Cancer B
E    Ecog 9    Ecog 1    Ecog 0    Ecog 1
F     val 6     val 1     val 0     val 1
0     100.0     200.0     500.0     300.0

How can I solve this?

Best answer

Do you mean:

df.set_index(list(df.columns[:-1])).T

Output:

A                 1                             2
B                 2                             3
C                 3                             4
D          Cancer A  Cancer B  Cancer A  Cancer B
E            Ecog 9    Ecog 1    Ecog 0    Ecog 1
F             val 6     val 1     val 0     val 1
measure_m       100       200       500       300
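
To see why this one-liner works, it may help to split it into steps. The sketch below is only an illustration (the intermediate names key_cols, indexed and wide are mine, not part of the answer): set_index moves the six key columns into a row MultiIndex, and .T swaps the axes, so those six levels end up on the columns and measure_m becomes the only row label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

# Every column except the last one (measure_m) is treated as a key.
key_cols = list(df.columns[:-1])

# Step 1: the key columns become a 6-level row index; 4 rows, 1 column.
indexed = df.set_index(key_cols)

# Step 2: transposing swaps rows and columns; 1 row, 4 columns.
wide = indexed.T

print(indexed.index.nlevels)  # levels on the rows before transposing
print(wide.columns.nlevels)   # the same levels, now on the columns
print(wide.shape)
```

Because transposing does not reorder anything, the values stay in the same order as the rows of df.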

Update: a small modification to match your output:

cols = ['A', 'B', 'C', 'D', 'E', 'F']

(df.set_index(cols)
   [['measure_m']]        # only needed if you have more columns
   .unstack(level=cols)
   .to_frame().T
)

Output:

  measure_m
A         1                             2
B         2                             3
C         3                             4
D  Cancer A  Cancer B  Cancer A  Cancer B
E    Ecog 9    Ecog 1    Ecog 0    Ecog 1
F     val 6     val 1     val 0     val 1
0       100       200       500       300
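
As a side note on why the original pivot call produced 4 rows: with index=None, pivot uses the existing index, and each of the 4 rows has its own index value (0 to 3), so each value lands on its own output row. A possible alternative, not taken from the answer above, is to give every row the same index value before pivoting. In the sketch below, row is a hypothetical helper column I introduce for that purpose.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2],
                   'B': [2, 2, 2, 3],
                   'C': [3, 3, 3, 4],
                   'D': ['Cancer A', 'Cancer B', 'Cancer A', 'Cancer B'],
                   'E': ['Ecog 9', 'Ecog 1', 'Ecog 0', 'Ecog 1'],
                   'F': ['val 6', 'val 1', 'val 0', 'val 1'],
                   'measure_m': [100, 200, 500, 300]})

cols = ['A', 'B', 'C', 'D', 'E', 'F']

# 'row' is a hypothetical helper column holding the same value (0) on every
# row, so pivot sees a single index value and emits a single output row.
result = (
    df.assign(row=0)
      .pivot(index='row', columns=cols, values=['measure_m'])
)

print(result.shape)
print(result)
```

This assumes every combination of A to F appears only once; pivot raises an error on duplicate index/column combinations. The column order of the result may differ from the row order in df.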

Regarding python - Pandas: collapsing rows in a MultiIndex DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64157447/
