
Python Pandas: Merge or Filter DataFrame by Another. Is there a better way?

Reposted. Author: 行者123. Updated: 2023-11-28 17:34:28

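A situation I sometimes run into: I have two DataFrames (df1, df2) and I want to create a new DataFrame (df3) based on the intersection of multiple columns between df1 and df2.

For example, I want to create df3 by filtering df1 on the columns Campaign and Group.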

import pandas as pd
df1 = pd.DataFrame({'Campaign':['Campaign 1', 'Campaign 2', 'Campaign 3', 'Campaign 3', 'Campaign 4'], 'Group':['Some group', 'Arbitrary Group', 'Group 1', 'Group 2', 'Done Group'], 'Metric':[245,91,292,373,32]}, columns=['Campaign', 'Group', 'Metric'])
df2 = pd.DataFrame({'Campaign':['Campaign 3', 'Campaign 3'], 'Group':['Group 1', 'Group 2'], 'Metric':[23, 456]}, columns=['Campaign', 'Group', 'Metric'])

df1

     Campaign            Group  Metric
0  Campaign 1       Some group     245
1  Campaign 2  Arbitrary Group      91
2  Campaign 3          Group 1     292
3  Campaign 3          Group 2     373
4  Campaign 4       Done Group      32

df2

     Campaign    Group  Metric
0  Campaign 3  Group 1      23
1  Campaign 3  Group 2     456

I know I can do this with a merge...

df3 = df1.merge(df2, how='inner', on=['Campaign', 'Group'], suffixes=('','_del'))
#df3
     Campaign    Group  Metric  Metric_del
0  Campaign 3  Group 1     292          23
1  Campaign 3  Group 2     373         456

But then I have to figure out how to drop the columns that end in _del. I guess something like this:

# DataFrame.select was removed in pandas 1.0; a boolean column mask does the same job
df3.loc[:, ~df3.columns.str.endswith('_del')]
## The result I'm going for, but it required a merge, then a column selection (2 steps)
     Campaign    Group  Metric
0  Campaign 3  Group 1     292
1  Campaign 3  Group 2     373

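Question

I'm mainly interested in returning df1 filtered solely on the Campaign|Group values of df2.

  1. Is there a better way to return df1 without resorting to merge?

  2. Is there a way to do the merge but return only df1's columns, without bringing df2's columns into the result?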

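Best answer

Assuming your df1 and df2 have exactly the same columns, you can first set the join-key columns as the index, then use df1.reindex(df2.index) followed by .dropna() to produce the intersection.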

df3 = df1.set_index(['Campaign', 'Group'])
df4 = df2.set_index(['Campaign', 'Group'])
# reindex first and dropna will produce the intersection
df3.reindex(df4.index).dropna(how='all').reset_index()

     Campaign    Group  Metric
0  Campaign 3  Group 1     292
1  Campaign 3  Group 2     373
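To make the role of .dropna() more concrete, here is a minimal sketch of the same reindex approach when df2 contains a key pair that df1 lacks. The frame df2_extra and its 'Campaign 9' row are hypothetical, made up for illustration and not part of the original question. Note that once reindex introduces a NaN, the integer Metric column is upcast to float.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Campaign': ['Campaign 1', 'Campaign 2', 'Campaign 3', 'Campaign 3', 'Campaign 4'],
    'Group': ['Some group', 'Arbitrary Group', 'Group 1', 'Group 2', 'Done Group'],
    'Metric': [245, 91, 292, 373, 32],
})

# Hypothetical variant of df2: the second key pair does not exist in df1
df2_extra = pd.DataFrame({
    'Campaign': ['Campaign 3', 'Campaign 9'],
    'Group': ['Group 1', 'No such group'],
    'Metric': [23, 456],
})

keys = ['Campaign', 'Group']
left = df1.set_index(keys)
right = df2_extra.set_index(keys)

# reindex keeps every key from `right`; keys missing in `left` get an all-NaN row
reindexed = left.reindex(right.index)

# dropna(how='all') removes those all-NaN rows, leaving only the intersection
result = reindexed.dropna(how='all').reset_index()
print(result)
```

This relies on the key pairs in df1 being unique, since reindex raises an error on a duplicated index.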

Edit:

Use .isin when the keys are not unique

# create some duplicated keys and values
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df3 = pd.concat([df3, df3])
df4 = pd.concat([df4, df4])

# isin
df3[df3.index.isin(df4.index)].reset_index()

     Campaign    Group  Metric
0  Campaign 3  Group 1     292
1  Campaign 3  Group 2     373
2  Campaign 3  Group 1     292
3  Campaign 3  Group 2     373
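The second question (merging while returning only df1's columns) is not addressed directly above. One possible sketch, not taken from the original answer, is to merge against only the key columns of df2, so there are no extra columns to drop afterwards. The same idea can also be written as a boolean mask without touching the index. The names keys, via_merge, and via_isin are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Campaign': ['Campaign 1', 'Campaign 2', 'Campaign 3', 'Campaign 3', 'Campaign 4'],
    'Group': ['Some group', 'Arbitrary Group', 'Group 1', 'Group 2', 'Done Group'],
    'Metric': [245, 91, 292, 373, 32],
})
df2 = pd.DataFrame({
    'Campaign': ['Campaign 3', 'Campaign 3'],
    'Group': ['Group 1', 'Group 2'],
    'Metric': [23, 456],
})

keys = ['Campaign', 'Group']

# Option A: merge with only df2's key columns, de-duplicated so that
# repeated keys in df2 cannot multiply rows of df1
via_merge = df1.merge(df2[keys].drop_duplicates(), how='inner', on=keys)

# Option B: build a MultiIndex from the key columns of each frame and
# keep the rows of df1 whose key pair appears in df2
mask = pd.MultiIndex.from_frame(df1[keys]).isin(pd.MultiIndex.from_frame(df2[keys]))
via_isin = df1[mask].reset_index(drop=True)

print(via_merge)
print(via_isin)
```

Both options return only the Campaign, Group, and Metric columns of df1. Option B also keeps df1's original row order and tolerates duplicate keys on either side.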

Regarding "Python Pandas: Merge or Filter DataFrame by Another. Is there a better way?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31925572/
