
python - Merging two dataframes with hierarchical columns

Reposted. Author: 行者123. Updated: 2023-12-04 15:17:29

This is my first time using a MultiIndex in pandas, and I need some help merging two dataframes with hierarchical columns. Here are my two dataframes:

import numpy as np
import pandas as pd

col_index = pd.MultiIndex.from_product([['a', 'b', 'c'], ['w', 'x']])
df1 = pd.DataFrame(np.ones([4, 6]), columns=col_index, index=range(4))

     a         b         c
     w    x    w    x    w    x
0  1.0  1.0  1.0  1.0  1.0  1.0
1  1.0  1.0  1.0  1.0  1.0  1.0
2  1.0  1.0  1.0  1.0  1.0  1.0
3  1.0  1.0  1.0  1.0  1.0  1.0

df2 = pd.DataFrame(np.zeros([2,6]),columns=col_index, index=range(2))

     a         b         c
     w    x    w    x    w    x
0  0.0  0.0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0  0.0  0.0

When I use the merge method, I get the following result:

pd.merge(df1, df2, how='left', suffixes=('', '_2'), left_index=True, right_index=True)

     a         b         c       a_2       b_2       c_2
     w    x    w    x    w    x    w    x    w    x    w    x
0  1.0  1.0  1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0
2  1.0  1.0  1.0  1.0  1.0  1.0  NaN  NaN  NaN  NaN  NaN  NaN
3  1.0  1.0  1.0  1.0  1.0  1.0  NaN  NaN  NaN  NaN  NaN  NaN

But I want to merge the two dataframes at the lower level, so that the suffixes apply to ['w', 'x'], like this:

     a                   b                   c
     w  w_2    x  x_2    w  w_2    x  x_2    w  w_2    x  x_2
0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
1  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
2  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
3  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN

Best answer

You can use join or merge together with swaplevel() or reorder_levels(). Then call .sort_index() with axis=1 to sort the columns by their index labels.

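  • .join() is the better fit when you are merging on the index, as here.
  • .swaplevel() is the better fit when there are two levels (as in this example), while .reorder_levels() is better for three or more levels.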
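
To illustrate the second point, here is a minimal sketch (not part of the original answer) that checks that swaplevel() and reorder_levels([1, 0]) produce the same columns on a two-level frame, and that reorder_levels() can permute three levels in one call. The three-level frame df_three and its labels 'p' and 'q' are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Two-level columns, as in the question.
col_index = pd.MultiIndex.from_product([['a', 'b', 'c'], ['w', 'x']])
df1 = pd.DataFrame(np.ones([4, 6]), columns=col_index, index=range(4))

# With exactly two levels, both calls exchange the levels in the same way.
swapped = df1.swaplevel(axis=1)
reordered = df1.reorder_levels([1, 0], axis=1)
same_columns = swapped.columns.equals(reordered.columns)

# Hypothetical three-level frame: reorder_levels permutes all levels at once,
# whereas swaplevel only exchanges two levels per call.
three_levels = pd.MultiIndex.from_product([['a', 'b'], ['w', 'x'], ['p', 'q']])
df_three = pd.DataFrame(np.ones([2, 8]), columns=three_levels)
rotated = df_three.reorder_levels([2, 0, 1], axis=1)
first_column = rotated.columns[0]
```

Neither call changes the data or the column order; only the order of the levels inside each column label changes, which is why the answer sorts the columns afterwards.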

Below are the four combinations of these methods. For this specific example, I think .join()/.swaplevel() is the most straightforward (see the last example):

# Variant 1: join + reorder_levels
df3 = (df1.reorder_levels([1, 0], axis=1)
          .join(df2.reorder_levels([1, 0], axis=1), rsuffix='_2')
          .reorder_levels([1, 0], axis=1).sort_index(axis=1, level=[0, 1]))
df3
Out[1]:
     a                   b                   c
     w  w_2    x  x_2    w  w_2    x  x_2    w  w_2    x  x_2
0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
1  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
2  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
3  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN

# Variant 2: merge + reorder_levels
df3 = (pd.merge(df1.reorder_levels([1, 0], axis=1),
                df2.reorder_levels([1, 0], axis=1),
                how='left', left_index=True, right_index=True, suffixes=('', '_2'))
         .reorder_levels([1, 0], axis=1).sort_index(axis=1, level=[0, 1]))
df3
Out[2]:
     a                   b                   c
     w  w_2    x  x_2    w  w_2    x  x_2    w  w_2    x  x_2
0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
1  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
2  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
3  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN

# Variant 3: merge + swaplevel
df3 = (pd.merge(df1.swaplevel(axis=1),
                df2.swaplevel(axis=1),
                how='left', left_index=True, right_index=True, suffixes=('', '_2'))
         .swaplevel(axis=1).sort_index(axis=1, level=[0, 1]))
df3
Out[3]:
     a                   b                   c
     w  w_2    x  x_2    w  w_2    x  x_2    w  w_2    x  x_2
0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
1  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
2  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
3  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN

# Variant 4: join + swaplevel
df3 = (df1.swaplevel(i=0, j=1, axis=1)
          .join(df2.swaplevel(axis=1), rsuffix='_2')
          .swaplevel(axis=1).sort_index(axis=1, level=[0, 1]))
df3
Out[4]:
     a                   b                   c
     w  w_2    x  x_2    w  w_2    x  x_2    w  w_2    x  x_2
0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
1  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0
2  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
3  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
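
As a sanity check on the variants above, the following self-contained sketch rebuilds the question's frames and compares the join and merge routes. The helper name finish is hypothetical and only factors out the shared last two steps. It assumes, as the outputs above suggest, that the suffix ends up on whichever level is outermost at merge time.

```python
import numpy as np
import pandas as pd

col_index = pd.MultiIndex.from_product([['a', 'b', 'c'], ['w', 'x']])
df1 = pd.DataFrame(np.ones([4, 6]), columns=col_index, index=range(4))
df2 = pd.DataFrame(np.zeros([2, 6]), columns=col_index, index=range(2))


def finish(df):
    # Hypothetical helper: move the suffixed level back down, then sort the columns.
    return df.swaplevel(axis=1).sort_index(axis=1, level=[0, 1])


# Route 1: join on the index with the levels swapped.
via_join = finish(df1.swaplevel(axis=1).join(df2.swaplevel(axis=1), rsuffix='_2'))

# Route 2: the equivalent left merge on the index.
via_merge = finish(pd.merge(df1.swaplevel(axis=1), df2.swaplevel(axis=1),
                            how='left', left_index=True, right_index=True,
                            suffixes=('', '_2')))

# Lower-level labels found under the top-level label 'a'.
lower_labels = list(via_join['a'].columns)
routes_agree = via_join.equals(via_merge)
```

If routes_agree is True, the choice between join and merge is a matter of readability rather than of results for this data.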

Regarding python - merging two dataframes with hierarchical columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64056064/
