
python - Convert a MultiIndex Series into a DataFrame

Reposted. Author: 行者123. Updated: 2023-12-01 09:17:37

Starting from an initial MultiIndex DataFrame like the one below:

import numpy as np
import pandas as pd
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
print(df)
                0         1         2         3
bar one -1.111899 -0.673956 -0.045719 -0.654951
    two  0.761249  1.009988  1.718598  1.461674
baz one -1.128029  0.360159 -0.004877 -0.725785
    two -0.007996  1.183093  1.651100 -1.408199
foo one  0.935349  0.816100  1.043749 -0.575600
    two  0.986057  0.790675 -0.302731  1.434262
qux one  0.564661 -2.821966  0.650187 -0.176112
    two -1.353135  0.192120 -0.314343 -1.242303

I only need the first column, which I extract like this:

series = df[0]
print(series)
bar  one   -1.111899
     two    0.761249
baz  one   -1.128029
     two   -0.007996
foo  one    0.935349
     two    0.986057
qux  one    0.564661
     two   -1.353135
print(type(series))
<class 'pandas.core.series.Series'>

How can I convert this Series into the following DataFrame:

          bar       baz       foo       qux
one -1.111899 -1.128029  0.935349  0.564661
two  0.761249 -0.007996  0.986057 -1.353135

Note that I don't insist on the intermediate second step. What matters is getting the resulting DataFrame.

Best answer

You need to call unstack on the first index level (level=0), which moves that level into the columns:

import numpy as np
import pandas as pd

# fix the seed so the random values are reproducible
np.random.seed(123)
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
print(df)
                0         1         2         3
bar one -1.085631  0.997345  0.282978 -1.506295
    two -0.578600  1.651437 -2.426679 -0.428913
baz one  1.265936 -0.866740 -0.678886 -0.094709
    two  1.491390 -0.638902 -0.443982 -0.434351
foo one  2.205930  2.186786  1.004054  0.386186
    two  0.737369  1.490732 -0.935834  1.175829
qux one -1.253881 -0.637752  0.907105 -1.428681
    two -0.140069 -0.861755 -0.255619 -2.798589

df1 = df[0].unstack(level=0)
print(df1)
          bar       baz       foo       qux
one -1.085631  1.265936  2.205930 -1.253881
two -0.578600  1.491390  0.737369 -0.140069
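
To make the reshaping easier to follow than with random numbers, here is a minimal sketch of the same unstack(level=0) call on a small hand-written Series. The values and the variable names (index, series, wide, wide_via_transpose) are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical two-level index: outer level (bar, baz), inner level (one, two)
index = pd.MultiIndex.from_product([["bar", "baz"], ["one", "two"]])
series = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Move the outer level (level=0) into the columns; the inner level stays as rows
wide = series.unstack(level=0)
print(wide)

# Unstacking the inner level (the default) and transposing gives the same layout
wide_via_transpose = series.unstack().T
print(wide.equals(wide_via_transpose))
```

The rows of wide are labelled one and two and the columns bar and baz, which is the shape asked for in the question.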

Another solution is to unstack the whole DataFrame first, which produces a MultiIndex in the columns, and then select with DataFrame.xs:

df1 = df.unstack(level=0)
print(df1)
            0                                       1                      \
          bar       baz       foo       qux       bar       baz       foo
one -1.085631  1.265936  2.205930 -1.253881  0.997345 -0.866740  2.186786
two -0.578600  1.491390  0.737369 -0.140069  1.651437 -0.638902  1.490732

                      2                                       3            \
          qux       bar       baz       foo       qux       bar       baz
one -0.637752  0.282978 -0.678886  1.004054  0.907105 -1.506295 -0.094709
two -0.861755 -2.426679 -0.443982 -0.935834 -0.255619 -0.428913 -0.434351


          foo       qux
one  0.386186 -1.428681
two  1.175829 -2.798589

# more general solution
df2 = df1.xs(0, level=0, axis=1)
# if you only need to select by the first level
# df2 = df1[0]
print(df2)
          bar       baz       foo       qux
one -1.085631  1.265936  2.205930 -1.253881
two -0.578600  1.491390  0.737369 -0.140069
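
As a rough check that the two approaches agree, the sketch below builds a small deterministic DataFrame and compares the xs route, plain column selection, and the Series unstack route. The data and the names wide_all, via_xs, via_getitem and via_series are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x2 frame with a two-level row index and integer column labels 0 and 1
index = pd.MultiIndex.from_product([["bar", "baz"], ["one", "two"]])
df = pd.DataFrame(np.arange(8.0).reshape(4, 2), index=index)

# Unstack everything: columns become (original column, first index level)
wide_all = df.unstack(level=0)

# Three ways to get the block that belongs to original column 0
via_xs = wide_all.xs(0, level=0, axis=1)   # cross-section on the top column level
via_getitem = wide_all[0]                  # plain selection on the top column level
via_series = df[0].unstack(level=0)        # select the column first, then unstack

print(via_xs)
print(via_xs.equals(via_getitem), via_xs.equals(via_series))
```

Selecting the column before unstacking avoids reshaping columns that are then thrown away, while xs is the more flexible option because its level argument can also target a deeper column level.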

Regarding "python - Convert a MultiIndex Series into a DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51095468/
