
python - How to merge multiple DataFrames sequentially?

Reposted. Author: 行者123. Updated: 2023-12-03 07:50:37

Although I think this question is probably a duplicate, I couldn't find the right answer.

I'm having some trouble merging multiple DataFrames sequentially.

For example, I have four DataFrames, as follows:

import pandas as pd

df1 = pd.DataFrame({'source': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
                    'target': ['1', '2', '3', '4', '5', '6', '7']})
df2 = pd.DataFrame({'source': ['A', 'A'],
                    'temp': ['a', 'b']})
df3 = pd.DataFrame({'source': ['B', 'B'],
                    'temp': ['c', 'd']})
df4 = pd.DataFrame({'source': ['C'],
                    'temp': ['e']})

I want to merge the DataFrames as follows:

#   source  target  temp
#0 A 1 a
#1 A 1 b
#2 A 2 a
#3 A 2 b
#4 A 3 a
#5 A 3 b
#6 B 4 c
#7 B 4 d
#8 B 5 c
#9 B 5 d
#10 C 6 e
#11 C 7 e

To do this, I tried running the following code, but it returned unexpected results.

# Trial 1
dfs = pd.merge(df1, df2, on='source', how='left')
# New columns are created with suffixes (temp_x, temp_y), but I want to keep
# only three columns: source, target, temp
dfs = pd.merge(dfs, df3, on='source', how='left')

# Trial 2
dfs = pd.merge(df1, df2, on='source', how='left')
# This fills only a fixed number of NaN values, but there are exceptions:
# one NaN row in dfs may need multiple values from df3 or df4
dfs['temp']=dfs.set_index('source')['temp'].fillna(df3.set_index('source')['temp'].to_dict()).values

# Trial 3
dfs = pd.merge(df1, df2, on='source', how='left')
# This did not change dfs
dfs[dfs['source']=='B']['temp']=pd.merge(df1, df3, on='source', how='left')['temp'].dropna()

Best answer

This is not a simple merge. You want to concatenate df2, df3, and df4 first, and then merge the result with df1:

df1.merge(pd.concat([df2,df3,df4]).drop_duplicates(), on='source')

Output:

   source target temp
0 A 1 a
1 A 1 b
2 A 2 a
3 A 2 b
4 A 3 a
5 A 3 b
6 B 4 c
7 B 4 d
8 B 5 c
9 B 5 d
10 C 6 e
11 C 7 e
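
The concatenate-then-merge idea can be sketched end to end as below. This is a minimal, self-contained variation rather than the answer's exact code: the name `lookup` is hypothetical, and `ignore_index=True` and `how='left'` are assumptions added here. `ignore_index=True` only tidies the stacked frame's index, and `how='left'` keeps any df1 row whose source has no match (its temp would be NaN), whereas the one-liner above uses the default inner join and would drop such rows.

```python
import pandas as pd

df1 = pd.DataFrame({'source': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
                    'target': ['1', '2', '3', '4', '5', '6', '7']})
df2 = pd.DataFrame({'source': ['A', 'A'], 'temp': ['a', 'b']})
df3 = pd.DataFrame({'source': ['B', 'B'], 'temp': ['c', 'd']})
df4 = pd.DataFrame({'source': ['C'], 'temp': ['e']})

# Stack the three (source, temp) frames into a single lookup table.
# ignore_index=True replaces the repeated labels 0, 1, 0, 1, 0 with 0..4.
lookup = pd.concat([df2, df3, df4], ignore_index=True).drop_duplicates()

# One merge on 'source': each df1 row is paired with every lookup row
# that shares its key, so no suffixed temp_x / temp_y columns appear.
result = df1.merge(lookup, on='source', how='left')

print(result)
```

Because every source value in df1 has at least one match in this data, the left join and the inner join give the same twelve rows here.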

Regarding "python - How to merge multiple DataFrames sequentially?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/77216542/
