python - Pandas hierarchical indexing and calculation

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:29:00

Given:

import pandas as pd

df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "which": ["A", "A", "A", "B", "B", "B"],
                   "score": [88, 80, 90, 92, 95, 99]})

df.set_index(['panum', 'which'], inplace=True)
df

             score
panum which
PA1   A         88
      A         80
      A         90
PA2   B         92
      B         95
      B         99

Is it possible to write something that creates a new index entry in "which" called max, holding the maximum per level, so that it creates two new rows, (PA1, max) and (PA2, max)?

Update

I have corrected the index. The example above is not what I meant.

panum  factor  score
PA1    init       90
       resub      94
       final      93
PA2    init       60
       resub      90
       final      88

In this better scenario, my question is: "I want to create a new 'panum' called mean, which will contain three rows: (mean, init), (mean, resub), (mean, final)."

The pseudocode would be something like df['mean'] = (df['pa1'] + df['pa2'])/2

I know this is a different question!

Best answer

You can create a new DataFrame of max values, add a second level 'max', append it to the original, and finally call sort_index:

# maximum score per panum (first index level), labelled 'max' on a second level
m = df.groupby(level=0).max().assign(max='max').set_index('max', append=True)
print(m)
           score
panum max
PA1   max     90
PA2   max     99

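# rename_axis gives m the same level names as df (panum, which) before concatenating
df = pd.concat([df, m.rename_axis(df.index.names)]).sort_index()
print(df)
             score
panum which
PA1   A         88
      A         80
      A         90
      max       90
PA2   B         92
      B         95
      B         99
      max       99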
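
Once the summary rows sit in the same index as the data, a cross-section on the second level is one way to read them back out. The following is a minimal sketch under stated assumptions: it rebuilds the question's frame, assumes a recent pandas (`groupby(level=...)` and `pd.concat`), and builds the (panum, max) labels with `pd.MultiIndex.from_product` rather than the `assign`/`set_index` chain used in the answer. The names `with_max` and `only_max` are illustrative and not from the original.

```python
import pandas as pd

df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "which": ["A", "A", "A", "B", "B", "B"],
                   "score": [88, 80, 90, 92, 95, 99]}).set_index(["panum", "which"])

# One row per panum holding the maximum score.
m = df.groupby(level="panum").max()

# Label those rows (panum, 'max') using the same level names as df.
m.index = pd.MultiIndex.from_product([m.index, ["max"]], names=df.index.names)

with_max = pd.concat([df, m]).sort_index()

# Select only the summary rows: cross-section on the 'which' level.
only_max = with_max.xs("max", level="which")
print(only_max)
```

`xs` drops the selected level, so `only_max` is indexed by panum alone.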

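Edited answer: the solution changes to taking the mean over the second level, plus swaplevel so that the rows align correctly in the final DataFrame: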

df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "factor": ["init", "resub", "final"] * 2,
                   "score": [90, 94, 93, 60, 90, 88]})

df.set_index(['panum', 'factor'], inplace=True)
print(df)
              score
panum factor
PA1   init       90
      resub      94
      final      93
PA2   init       60
      resub      90
      final      88

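# mean score per factor (second index level); sort=False keeps init/resub/final order
m = (df.groupby(level=1, sort=False).mean()
       .assign(factor='mean')
       .set_index('factor', append=True)
       .swaplevel(0, 1))
print(m)
               score
factor factor
mean   init     75.0
       resub    92.0
       final    90.5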
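
Why the swaplevel step matters: after set_index(..., append=True) the constant label 'mean' lands in the second index level, while the original frame keeps panum in the first. The sketch below prints the index tuples before and after the swap. It assumes a recent pandas where the per-level mean is written with `groupby`, and the column name `label` is a hypothetical stand-in chosen only to keep the two levels easy to tell apart.

```python
import pandas as pd

df = pd.DataFrame({"panum": ["PA1"] * 3 + ["PA2"] * 3,
                   "factor": ["init", "resub", "final"] * 2,
                   "score": [90, 94, 93, 60, 90, 88]}).set_index(["panum", "factor"])

# Mean per factor, keeping the original init/resub/final order.
per_factor = df.groupby(level="factor", sort=False).mean()

# Appending a constant column as an index level puts it last.
before = per_factor.assign(label="mean").set_index("label", append=True)

# Swapping the two levels moves 'mean' into the panum position.
after = before.swaplevel(0, 1)

print(before.index.tolist())  # factor first, 'mean' second
print(after.index.tolist())   # 'mean' first, factor second
```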

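# rename_axis gives m the same level names as df (panum, factor) before concatenating
df = pd.concat([df, m.rename_axis(df.index.names)])
print(df)
              score
panum factor
PA1   init     90.0
      resub    94.0
      final    93.0
PA2   init     60.0
      resub    90.0
      final    88.0
mean  init     75.0
      resub    92.0
      final    90.5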
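
The question's pseudocode, (df['pa1'] + df['pa2'])/2, can also be written almost literally by pivoting panum into columns first. This is an alternative sketch, not the answerer's method: it assumes a recent pandas, and the names `wide` and `tidy` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"panum": ["PA1"] * 3 + ["PA2"] * 3,
                   "factor": ["init", "resub", "final"] * 2,
                   "score": [90, 94, 93, 60, 90, 88]}).set_index(["panum", "factor"])

# Wide layout: one column per panum, one row per factor.
wide = df["score"].unstack("panum")

# The pseudocode from the question, with the real column labels.
wide["mean"] = (wide["PA1"] + wide["PA2"]) / 2

# Back to the long (panum, factor) layout.
tidy = (wide.stack()
            .swaplevel(0, 1)
            .rename("score")
            .to_frame()
            .sort_index())
print(tidy)
```

Note that unstack and sort_index order the labels alphabetically, so the rows do not keep the init/resub/final order of the input.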

Regarding python - Pandas hierarchical indexing and calculation, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50292297/
