
python - Pandas: take a label from a row of a DataFrame and turn it into a column name

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:51:17

I have the following example:

import numpy as np
import pandas as pd

feature_labels = ["A", "B", "C"]
n_days = 3
n_persons = 2
n_features = len(feature_labels)

data = pd.DataFrame({
    # each (person, day) pair spans n_features consecutive rows
    "day": np.repeat(list(np.arange(n_days))*n_persons, n_features),
    "person": np.repeat(np.arange(n_persons), n_days*n_features),
    "feature": feature_labels*(n_days*n_persons),
    "value": np.random.rand(n_features*n_days*n_persons)
})
data

It returns:

    day feature  person     value
0     0       A       0  0.519279
1     0       B       0  0.243156
2     0       C       0  0.093231
3     1       A       0  0.046888
4     1       B       0  0.775699
5     1       C       0  0.757114
6     2       A       0  0.983894
7     2       B       0  0.709877
8     2       C       0  0.256220
9     0       A       1  0.823253
10    0       B       1  0.014050
11    0       C       1  0.740373
12    1       A       1  0.554485
13    1       B       1  0.828009
14    1       C       1  0.398025
15    2       A       1  0.033659
16    2       B       1  0.904537
17    2       C       1  0.649851

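I need to get a data table with the following columns: day, person, A, B, C, holding the corresponding values. I would appreciate it if you could tell me how to do this with the pandas API.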

Best answer

In [325]: data.set_index(["day", "person", "feature"])['value'] \
.unstack('feature').reset_index().rename_axis(None, axis=1)
Out[325]:
   day  person         A         B         C
0    0       0  0.852395  0.975006  0.884853
1    0       1  0.044862  0.505431  0.376252
2    1       0  0.359508  0.598859  0.354796
3    1       1  0.592805  0.629942  0.142600
4    2       0  0.340190  0.178081  0.237694
5    2       1  0.933841  0.946380  0.602297
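
For use in a script rather than an IPython session, the same chain can be sketched as follows. The fixed seed and the name `wide` are assumptions added for illustration and are not part of the original answer; `axis=1` is passed by keyword because recent pandas releases no longer accept it positionally.

```python
import numpy as np
import pandas as pd

feature_labels = ["A", "B", "C"]
n_days, n_persons = 3, 2
n_features = len(feature_labels)

# Fixed seed: an assumption, used only to make the sketch reproducible.
rng = np.random.default_rng(0)

# Long format: one row per (person, day, feature) combination.
data = pd.DataFrame({
    "day": np.repeat(np.tile(np.arange(n_days), n_persons), n_features),
    "person": np.repeat(np.arange(n_persons), n_days * n_features),
    "feature": feature_labels * (n_days * n_persons),
    "value": rng.random(n_features * n_days * n_persons),
})

# Wide format: one row per (day, person), one column per feature label.
wide = (
    data.set_index(["day", "person", "feature"])["value"]
        .unstack("feature")          # feature labels become column names
        .reset_index()               # day and person back to ordinary columns
        .rename_axis(None, axis=1)   # drop the leftover axis name "feature"
)
print(wide)
```

The result should have one row per (day, person) pair and the columns day, person, A, B, C.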

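Explanation:

If we don't select ['value'] before calling .unstack(), we get multi-level columns. In general there can be several non-index columns at the time of unstacking, so Pandas labels each unstacked block with the name of the column it came from: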

In [328]: data.set_index(["day", "person", "feature"]).unstack('feature')
Out[328]:
               value
feature            A         B         C
day person
0   0       0.852395  0.975006  0.884853
    1       0.044862  0.505431  0.376252
1   0       0.359508  0.598859  0.354796
    1       0.592805  0.629942  0.142600
2   0       0.340190  0.178081  0.237694
    1       0.933841  0.946380  0.602297

In [329]: data.set_index(["day", "person", "feature"])['value'].unstack('feature')
Out[329]:
feature            A         B         C
day person
0   0       0.852395  0.975006  0.884853
    1       0.044862  0.505431  0.376252
1   0       0.359508  0.598859  0.354796
    1       0.592805  0.629942  0.142600
2   0       0.340190  0.178081  0.237694
    1       0.933841  0.946380  0.602297
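
The reason for the extra column level is easier to see with more than one value column. The sketch below uses a small hand-written frame; the `weight` column and all variable names are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hand-written frame; "weight" is a hypothetical second value column.
df = pd.DataFrame({
    "day": [0, 0, 1, 1],
    "feature": ["A", "B", "A", "B"],
    "value": [1.0, 2.0, 3.0, 4.0],
    "weight": [10, 20, 30, 40],
})
indexed = df.set_index(["day", "feature"])

# Unstacking the DataFrame keeps track of which column each block came from,
# so the columns get two levels: (original column, feature label).
both = indexed.unstack("feature")

# Unstacking a single Series leaves only the feature labels as columns.
only_value = indexed["value"].unstack("feature")

print(both.columns.nlevels)        # 2
print(only_value.columns.nlevels)  # 1
print(list(both.columns))
```

Selecting one column first is what keeps the result flat; with two value columns the top level is needed to tell the two blocks apart.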

.rename_axis(None, axis=1) helps us get rid of feature (the name of the columns axis):

In [334]: x = data.set_index(["day", "person", "feature"])['value'].unstack('feature').reset_index()

In [335]: x.columns
Out[335]: Index(['day', 'person', 'A', 'B', 'C'], dtype='object', name='feature')
# NOTE: the columns axis is still named: name='feature'

In [336]: x = x.rename_axis(None, axis=1)

In [337]: x.columns
Out[337]: Index(['day', 'person', 'A', 'B', 'C'], dtype='object')
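
There are a couple of other spellings that should have the same effect on the columns axis name. The sketch below compares them on a tiny stand-in frame; the frame and the names `a`, `b`, `c` are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Stand-in frame whose columns axis carries a leftover name, as after unstack.
x = pd.DataFrame({"day": [0, 1], "A": [0.1, 0.2]})
x.columns.name = "feature"

a = x.rename_axis(None, axis=1)   # form used in the answer
b = x.rename_axis(columns=None)   # keyword form, no axis number needed

c = x.copy()
c.columns.name = None             # assign the name directly on a copy

print(a.columns.name, b.columns.name, c.columns.name)
```

The two rename_axis calls return new frames, which suits method chaining; the direct assignment changes the frame it is applied to.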

Regarding python - Pandas: take a label from a row of a DataFrame and turn it into a column name, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43788097/
