
python - Advanced cross-section with MultiIndex in pandas

Repost. Author: 行者123. Updated: 2023-11-28 22:49:08

I have the following DataFrame:

import numpy as np
import pandas as pd

lb = [('A','a',1), ('A','a',2), ('A','a',3), ('A','b',1), ('A','b',2), ('A','b',3), ('B','a',1), ('B','a',2), ('B','a',3), ('B','b',1), ('B','b',2), ('B','b',3)]
col = pd.MultiIndex.from_tuples(lb, names=['first','second','third'])
df = pd.DataFrame(np.random.randn(5,12), columns=col)

first          A                                                           B  \
second         a                             b                             a
third          1         2         3         1         2         3         1
0       1.597958  2.054695  0.449745 -0.990393  0.780978 -0.590558 -0.691706
1      -0.093841 -1.203769  1.779555 -0.299931 -0.411360  0.122852 -0.250156
2       0.025183  0.514480 -0.420666  1.574669  0.962010  1.278237 -0.976286
3      -1.028288 -0.506581  0.880370  1.513487 -0.066479 -0.100231  0.785042
4      -1.635642  0.464074 -0.335941 -0.034194  0.412519 -0.672058  0.113886

first
second                             b
third          2         3         1         2         3
0       1.954769  0.705860 -1.712058  1.015807  1.245232
1      -2.037299 -0.120649 -0.114652 -0.686707 -0.993540
2       0.918084 -0.892378 -0.741131 -2.547121  0.797637
3       0.000077  2.123063  0.903571  1.972190 -1.179325
4      -1.145241 -1.773182  0.407046 -0.301640 -0.173261

I want to select every column whose 'third' level is 2 or 3, i.e. something like

df.xs([2,3], level='third', axis=1, drop_level=False)

But this does not work. How should I proceed?

Best answer

MultiIndex slicing of this kind is a new feature in pandas 0.14.0; see the whatsnew for that release. It effectively replaces the need for .xs.

In [8]: idx = pd.IndexSlice

In [9]: df.loc[:,idx[:,:,[2,3]]]
Out[9]:
first          A                                       B
second         a                   b                   a                   b
third          2         3         2         3         2         3         2         3
0       1.770120 -0.362269 -0.804352  1.549652  0.069858 -0.274113  0.570410 -0.460956
1      -0.982169  2.044497  0.571353  0.310634 -1.865966 -0.862613  0.124413  0.645419
2      -1.412519  0.168448  0.081467 -0.220464  1.033748  1.561429  0.094363  0.254768
3      -0.653458 -0.978661  0.158708 -0.818675 -1.122577  0.026941  2.678548  0.864817
4      -0.555179 -0.155564  1.148956  1.438523 -1.254660  0.609254 -0.970612  1.519028
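
As a self-contained sketch of the selection above: the snippet below rebuilds the same column layout with deterministic values (np.arange instead of random numbers, which is a choice made here for reproducibility) and selects the columns two ways. The names by_slice, by_mask, and mask are illustrative only. The boolean-mask variant is an alternative that is not part of the original answer; it relies on the standard Index methods get_level_values and isin.

```python
import numpy as np
import pandas as pd

# Same three-level column layout as in the question, deterministic values.
col = pd.MultiIndex.from_product(
    [["A", "B"], ["a", "b"], [1, 2, 3]],
    names=["first", "second", "third"],
)
df = pd.DataFrame(np.arange(5 * 12).reshape(-1, 12), columns=col)

idx = pd.IndexSlice

# Slicer: every 'first', every 'second', and 'third' in {2, 3}.
by_slice = df.loc[:, idx[:, :, [2, 3]]]

# Alternative (not from the answer): boolean mask over the 'third' level.
mask = df.columns.get_level_values("third").isin([2, 3])
by_mask = df.loc[:, mask]

print(by_slice.columns.tolist())
```

Both forms keep all three column levels, which matches what drop_level=False was meant to achieve in the question.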

Subtracting the third == 1 columns from this selection is non-trivial.

In [107]: df = pd.DataFrame(np.arange(5*12).reshape(-1,12), columns=col)

In [108]: df
Out[108]:
first    A                       B
second   a           b           a           b
third    1   2   3   1   2   3   1   2   3   1   2   3
0        0   1   2   3   4   5   6   7   8   9  10  11
1       12  13  14  15  16  17  18  19  20  21  22  23
2       24  25  26  27  28  29  30  31  32  33  34  35
3       36  37  38  39  40  41  42  43  44  45  46  47
4       48  49  50  51  52  53  54  55  56  57  58  59

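pandas wants to align the right-hand side (after all, you are subtracting objects with different indexes), so you need to broadcast it manually. There is an issue about this: https://github.com/pydata/pandas/issues/7475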
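
To illustrate the alignment behaviour, here is a minimal sketch of what a naive subtraction does. It assumes the deterministic frame from In [107]; the names lhs, rhs, and naive are illustrative only.

```python
import numpy as np
import pandas as pd

col = pd.MultiIndex.from_product(
    [["A", "B"], ["a", "b"], [1, 2, 3]],
    names=["first", "second", "third"],
)
df = pd.DataFrame(np.arange(5 * 12).reshape(-1, 12), columns=col)
idx = pd.IndexSlice

lhs = df.loc[:, idx[:, :, [2, 3]]]  # 8 columns, third in {2, 3}
rhs = df.loc[:, idx[:, :, 1]]       # 4 columns, third == 1

# DataFrame arithmetic aligns on column labels first. The two sides share
# no column labels, so no cell has a partner and the result is all NaN.
naive = lhs - rhs
print(naive.isna().all().all())
```

Passing a plain NumPy array on the right-hand side (via .values) skips label alignment, which is why the answer drops down to .values before subtracting.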

In [109]: df.loc[:,idx[:,:,[2,3]]] = df.loc[:,idx[:,:,[2,3]]] - np.tile(df.loc[:,idx[:,:,1]].values, 2)

In [110]: df.loc[:,idx[:,:,[2,3]]]
Out[110]:
first    A               B
second   a       b       a       b
third    2   3   2   3   2   3   2   3
0        1  -1  -2  -4   7   5   4   2
1        1  -1  -2  -4   7   5   4   2
2        1  -1  -2  -4   7   5   4   2
3        1  -1  -2  -4   7   5   4   2
4        1  -1  -2  -4   7   5   4   2
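
One thing worth checking in the result above: np.tile(values, 2) repeats the whole block of four third == 1 columns, so the k-th selected column is paired with base column k mod 4 rather than with the third == 1 column of its own (first, second) group. If the intent is a per-group subtraction (an assumption; the answer does not state it), np.repeat along axis 1 may be the better fit. The sketch below compares the two; base, selected, tiled, and per_group are illustrative names.

```python
import numpy as np
import pandas as pd

col = pd.MultiIndex.from_product(
    [["A", "B"], ["a", "b"], [1, 2, 3]],
    names=["first", "second", "third"],
)
df = pd.DataFrame(np.arange(5 * 12).reshape(-1, 12), columns=col)
idx = pd.IndexSlice

base = df.loc[:, idx[:, :, 1]].values     # shape (5, 4): one column per group
selected = df.loc[:, idx[:, :, [2, 3]]]   # shape (5, 8): two columns per group

# tile: base columns ordered g0 g1 g2 g3 g0 g1 g2 g3 (as in the answer).
tiled = selected - np.tile(base, 2)

# repeat: base columns ordered g0 g0 g1 g1 g2 g2 g3 g3 (per-group pairing).
per_group = selected - np.repeat(base, 2, axis=1)

print(tiled.iloc[0].tolist())      # [1, -1, -2, -4, 7, 5, 4, 2]
print(per_group.iloc[0].tolist())  # [1, 2, 1, 2, 1, 2, 1, 2]
```

With np.repeat, every third == 2 column ends up 1 above its group's third == 1 column and every third == 3 column ends up 2 above it, which is what the arange data would suggest for a within-group difference.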

Regarding python - Advanced cross-section with MultiIndex in pandas, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24258781/
