
python - Merging MultiIndex DataFrames

Reposted. Author: 行者123. Updated: 2023-12-01 03:59:54

Consider the following two DataFrames:

import numpy as np
import pandas as pd

# Two-level column index: level 0 is the group ('foo'/'bar'), level 1 the column name
arrays1 = [['foo', 'bar', 'bar', 'bar'],
           ['A', 'D', 'E', 'F']]
tuples1 = list(zip(*arrays1))
columnValues1 = pd.MultiIndex.from_tuples(tuples1)
df1 = pd.DataFrame(np.random.rand(4, 4), columns=columnValues1)
print(df1)
        foo       bar
          A         D         E         F
0  0.833444  0.354676  0.468294  0.173005
1  0.409730  0.275342  0.595433  0.322785
2  0.515161  0.340063  0.117509  0.491957
3  0.285594  0.970524  0.322902  0.628351

arrays2 = [['foo', 'foo', 'bar', 'bar'],
           ['B', 'C', 'G', 'H']]
tuples2 = list(zip(*arrays2))
columnValues2 = pd.MultiIndex.from_tuples(tuples2)
df2 = pd.DataFrame(np.random.rand(4, 4), columns=columnValues2)
print(df2)
        foo                 bar
          B         C         G         H
0  0.208822  0.762884  0.424412  0.583324
1  0.767560  0.884583  0.716843  0.329719
2  0.147991  0.424748  0.560599  0.828155
3  0.376050  0.436354  0.704379  0.406324

Suppose I want to merge them to get this:

        foo                           bar
          A         B         C         D         E         F         G         H
0 0.833444 0.208822 0.762884 0.354676 0.468294 0.173005 0.424412 0.583324
1 0.409730 0.767560 0.884583 0.275342 0.595433 0.322785 0.716843 0.329719
2 0.515161 0.147991 0.424748 0.340063 0.117509 0.491957 0.560599 0.828155
3 0.285594 0.376050 0.436354 0.970524 0.322902 0.628351 0.704379 0.406324

I tried merge:

pd.merge(df1.reset_index(), df2.reset_index(), on=df1.columns.levels[0], 
how='inner').set_index(df1.columns.levels[0])

Unfortunately, I get the following error message:

ValueError: The truth value of an array with more than one element is ambiguous. 
Use a.any() or a.all()

How can I merge two MultiIndex DataFrames?

Best answer

UPDATE: selecting the columns dynamically:

In [57]: join = df1.join(df2)

In [58]: cols = join.columns.get_level_values(0).unique()

In [59]: cols
Out[59]: array(['foo', 'bar'], dtype=object)

In [60]: join = join[cols]

In [61]: join
Out[61]:
        foo                           bar                               \
          A         B         C         D         E         F         G
0  0.176934  0.694937  0.947164  0.510407  0.085626  0.162183  0.382840
1  0.973283  0.743907  0.886495  0.028961  0.740759  0.330742  0.961932
2  0.898224  0.966278  0.131551  0.517563  0.026104  0.624047  0.848640
3  0.713660  0.704461  0.419997  0.718130  0.252294  0.336838  0.016916


          H
0  0.929695
1  0.444762
2  0.338168
3  0.635817

joined = df1.join(df2)[['foo','bar']]
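
The dynamic selection above can be wrapped up as a small reusable function. The following is only a sketch: the helper name `join_grouped`, the fixed random seed, and the use of `np.random.default_rng` are assumptions made for illustration and are not part of the original answer. It relies on the behaviour shown in the transcripts, namely that selecting with a list of level-0 labels gathers each group's columns together.

```python
import numpy as np
import pandas as pd


def join_grouped(left, right):
    """Join two frames on their row index, then regroup columns by level 0.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    joined = left.join(right)
    # Level-0 labels in order of first appearance, e.g. 'foo' then 'bar'.
    top_labels = joined.columns.get_level_values(0).unique()
    # Selecting with a list of level-0 labels pulls each group's columns together.
    return joined[list(top_labels)]


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
cols1 = pd.MultiIndex.from_tuples(
    [('foo', 'A'), ('bar', 'D'), ('bar', 'E'), ('bar', 'F')])
cols2 = pd.MultiIndex.from_tuples(
    [('foo', 'B'), ('foo', 'C'), ('bar', 'G'), ('bar', 'H')])
df1 = pd.DataFrame(rng.random((4, 4)), columns=cols1)
df2 = pd.DataFrame(rng.random((4, 4)), columns=cols2)

result = join_grouped(df1, df2)
print(result.columns.tolist())
```

Because the level-0 labels are read from the joined frame, the function does not need to know the group names in advance. The groups come out in the order in which they first appear in the left frame.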

Explanation:

You can join your DataFrames first:

In [47]: join = df1.join(df2)

In [48]: join
Out[48]:
        foo       bar                           foo                 bar \
          A         D         E         F         B         C         G
0  0.176934  0.510407  0.085626  0.162183  0.694937  0.947164  0.382840
1  0.973283  0.028961  0.740759  0.330742  0.743907  0.886495  0.961932
2  0.898224  0.517563  0.026104  0.624047  0.966278  0.131551  0.848640
3  0.713660  0.718130  0.252294  0.336838  0.704461  0.419997  0.016916


          H
0  0.929695
1  0.444762
2  0.338168
3  0.635817

Then select the columns in the desired order (level 0):

In [49]: join = join[['foo','bar']]

In [50]: join
Out[50]:
        foo                           bar                               \
          A         B         C         D         E         F         G
0  0.176934  0.694937  0.947164  0.510407  0.085626  0.162183  0.382840
1  0.973283  0.743907  0.886495  0.028961  0.740759  0.330742  0.961932
2  0.898224  0.966278  0.131551  0.517563  0.026104  0.624047  0.848640
3  0.713660  0.704461  0.419997  0.718130  0.252294  0.336838  0.016916


          H
0  0.929695
1  0.444762
2  0.338168
3  0.635817
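
As an alternative to `join`, the two frames can probably also be placed side by side with `pd.concat(..., axis=1)` and then regrouped with the same level-0 selection. This is a different technique from the one the answer uses. The sketch below assumes both frames share the same row index, and the seed and variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility
cols1 = pd.MultiIndex.from_tuples(
    [('foo', 'A'), ('bar', 'D'), ('bar', 'E'), ('bar', 'F')])
cols2 = pd.MultiIndex.from_tuples(
    [('foo', 'B'), ('foo', 'C'), ('bar', 'G'), ('bar', 'H')])
df1 = pd.DataFrame(rng.random((4, 4)), columns=cols1)
df2 = pd.DataFrame(rng.random((4, 4)), columns=cols2)

# Step 1: put the frames side by side; rows are aligned on the index.
combined = pd.concat([df1, df2], axis=1)

# Step 2: select level-0 groups in the desired order, as in the answer.
result = combined[['foo', 'bar']]
print(result.columns.tolist())
```

Both approaches align rows on the index, so for frames with identical indexes like these they should give the same values. `join` defaults to a left join, while `concat` defaults to an outer join, so the results can differ when the indexes do not match.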

About python - merging MultiIndex DataFrames: a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36771843/
