
python - pandas MultiIndex selection with column conditions

Reposted. Author: 行者123. Updated: 2023-12-02 06:40:14

I'm having trouble filtering a DataFrame with MultiIndex columns.

import numpy as np
import pandas as pd
df = pd.DataFrame(
    np.random.randn(6, 6),
    columns=pd.MultiIndex.from_arrays((
        ['Team_1', 'Team_1', 'Team_1', 'Team_1', 'Team_1', 'Team_1'],
        ['A', 'A', 'A', 'B', 'B', 'B'],
        ['a', 'b', 'c', 'a', 'b', 'c'])))

which looks like:

     Team_1
          A                             B
          a         b         c         a         b         c
0  1.663478  1.121481 -0.675905 -1.286932 -0.713381  0.835101
1  0.076587  1.334063 -1.804435 -0.892450 -0.349493 -1.448643
2  0.485618  0.675481 -0.488584 -0.354583  1.827532 -1.184389
3 -0.531397 -0.145830 -1.143331 -0.871459 -0.009081 -1.741627
4  0.355948 -2.275475  0.543201 -0.099087 -1.114334 -1.248298
5  1.448409 -0.974127  2.004364 -0.880845  1.195134  0.392949

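What I want is to output df where a and b must each meet a condition, and where the c column is included only if both conditions hold. For example, with the cuts a > 0 and b < 0, the output should look like: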

     Team_1
          A                             B
          a         b         c         a         b         c
0  1.663478       NaN       NaN       NaN -0.713381       NaN
1  0.076587       NaN       NaN       NaN -0.349493       NaN
2  0.485618       NaN       NaN -0.354583       NaN       NaN
3       NaN -0.145830       NaN       NaN -0.009081       NaN
4  0.355948 -2.275475  0.543201       NaN -1.114334       NaN
5  1.448409 -0.974127  2.004364       NaN       NaN       NaN

To start, I can do a basic selection with (df.iloc[:, df.columns.get_level_values(2) == 'a'] > 0), but I don't know where to go from there.
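As a small sketch of what that selection returns: the frame below, named `small`, uses fixed values that are purely illustrative (they are an assumption, not the random data above). The boolean mask on the third column level picks out only the `a` columns, and the comparison yields a boolean frame over those columns.

```python
import pandas as pd

# Hypothetical fixed data with the same column layout as df
columns = pd.MultiIndex.from_product([["Team_1"], ["A", "B"], ["a", "b", "c"]])
small = pd.DataFrame(
    [[1.0, -1.0, 5.0, -2.0, -3.0, 6.0],
     [2.0, 3.0, 7.0, 4.0, -5.0, 8.0]],
    columns=columns,
)

# True for every column whose innermost label is 'a'
is_a = small.columns.get_level_values(2) == "a"

# Boolean frame restricted to the 'a' columns: one column per (team, group)
a_positive = small.iloc[:, is_a] > 0
print(a_positive)
```

The result covers only the `a` columns, so it cannot be applied directly to `b` or `c`; the conditions for the three inner labels still have to be lined up per (team, group) pair.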

Best answer

Another option is to use stack/unstack:

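result = (
    # Move the team and A/B column levels into the row index, leaving a/b/c as columns
    df.stack(level=[0, 1])
    .assign(
        # c is kept only where both conditions hold
        c=lambda d: np.where((d["a"] > 0) & (d["b"] < 0), d["c"], np.nan),
        a=lambda d: np.where(d["a"] > 0, d["a"], np.nan),
        b=lambda d: np.where(d["b"] < 0, d["b"], np.nan),
    )
    # Return to the wide layout and restore the (team, A/B, a/b/c) level order
    .unstack(level=[1, 2])
    .reorder_levels([1, 2, 0], axis=1)
    .sort_index(level=1, axis=1)
)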
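To make the individual steps of a stack/unstack pipeline like this easier to follow, here is a minimal sketch on a fixed two-row frame. The names `small`, `long`, `filtered`, and `wide` and the data values are illustrative assumptions, and `Series.where` is used in place of `np.where` purely for readability.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed data, chosen so every outcome is easy to check by eye
columns = pd.MultiIndex.from_product([["Team_1"], ["A", "B"], ["a", "b", "c"]])
small = pd.DataFrame(
    [[1.0, -1.0, 5.0, -2.0, -3.0, 6.0],
     [2.0, 3.0, 7.0, 4.0, -5.0, 8.0]],
    columns=columns,
)

# Step 1: stacking column levels 0 and 1 leaves one row per (row, team, group)
# and plain a/b/c columns, so the conditions become ordinary column comparisons
long = small.stack(level=[0, 1])

keep_a = long["a"] > 0
keep_b = long["b"] < 0

# Step 2: blank out values that fail their condition; c requires both
filtered = long.assign(
    a=long["a"].where(keep_a),
    b=long["b"].where(keep_b),
    c=long["c"].where(keep_a & keep_b),
)

# Step 3: unstack back to wide form and restore the original level order
wide = (
    filtered.unstack(level=[1, 2])
    .reorder_levels([1, 2, 0], axis=1)
    .sort_index(axis=1)
)
print(wide)
```

In this sketch, row 0 keeps all of group A (a = 1 > 0 and b = -1 < 0) but only `b` in group B, while row 1 keeps only `a` in group A and all of group B.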

If we start with a df like this:

     Team_1
          A                             B
          a         b         c         a         b         c
0  0.622728 -1.059337  0.154738 -1.118633  0.336635  1.173941
1  0.166443  0.236547  0.690746  0.169085 -0.107237 -0.539768
2 -1.270542  0.525559  0.335747  0.455872 -0.523938  0.508105
3  1.964184  0.281073  0.567805  0.012256  2.773986 -0.900674
4  1.997804 -0.621523 -0.253128  1.867092  0.134846  2.729482
5  0.860470 -0.293951 -1.581081 -2.014744  1.357025 -1.007692

the output will be:

     Team_1
          A                             B
          a         b         c         a         b         c
0  0.622728 -1.059337  0.154738       NaN       NaN       NaN
1  0.166443       NaN       NaN  0.169085 -0.107237 -0.539768
2       NaN       NaN       NaN  0.455872 -0.523938  0.508105
3  1.964184       NaN       NaN  0.012256       NaN       NaN
4  1.997804 -0.621523 -0.253128  1.867092       NaN       NaN
5  0.860470 -0.293951 -1.581081       NaN       NaN       NaN
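For comparison, the same masking can probably be done without reshaping, by taking cross-sections on the innermost column level. This is a sketch of an alternative technique, not the approach from the answer above; the helper name `filter_abc` and the frame `small` are hypothetical.

```python
import numpy as np
import pandas as pd

def filter_abc(df):
    """Keep a where a > 0, b where b < 0, and c only where both hold."""
    # Each cross-section drops the innermost level, leaving (team, group) columns
    a = df.xs("a", axis=1, level=2)
    b = df.xs("b", axis=1, level=2)
    c = df.xs("c", axis=1, level=2)

    keep_a = a > 0
    keep_b = b < 0

    # Concatenating a dict adds its keys as a new outermost column level
    out = pd.concat(
        {"a": a.where(keep_a), "b": b.where(keep_b), "c": c.where(keep_a & keep_b)},
        axis=1,
    )
    # Move a/b/c back to the innermost level and restore the original column order
    out = out.reorder_levels([1, 2, 0], axis=1)
    return out.reindex(columns=df.columns)

# Hypothetical fixed data with the same column layout as df
columns = pd.MultiIndex.from_product([["Team_1"], ["A", "B"], ["a", "b", "c"]])
small = pd.DataFrame(
    [[1.0, -1.0, 5.0, -2.0, -3.0, 6.0],
     [2.0, 3.0, 7.0, 4.0, -5.0, 8.0]],
    columns=columns,
)

masked = filter_abc(small)
print(masked)
```

Because `a`, `b`, and `c` share the same (team, group) columns after `xs`, the boolean frames align column by column, which is what lets `keep_a & keep_b` gate `c` for each group independently.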

For more on python - pandas MultiIndex selection with column conditions, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59668982/
