
python - How to create a binary label from two tables

Reposted. Author: 行者123. Updated: 2023-11-28 21:42:49

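I am trying to analyze trends in some data with pandas. I have two tables, and I want to add a new binary column to one of them that is 1 when the row's UID and PID both appear together in the other table. An example of the tables I currently have: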

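>>> import pandas as pd
>>> # 012 is not a valid integer literal in Python 3, so the UID is written as 12
>>> df_a = pd.DataFrame({"PID": [12, 55, 56, 89], "TIM": [76, 54, 21, 25], "UID": [123, 456, 789, 12]})
>>> df_a
   PID  TIM  UID
0   12   76  123
1   55   54  456
2   56   21  789
3   89   25   12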

>>> df_b = pd.DataFrame({"FOO": [2347, 32447, 3234, 7999], "PID": [17, 89, 51, 55], "UID": [221, 12, 653, 456]})
>>> df_b
     FOO  PID  UID
0   2347   17  221
1  32447   89   12
2   3234   51  653
3   7999   55  456

I would like the final result to be:

>>> df_a
   PID  TIM  UID  PUR
0   12   76  123    0
1   55   54  456    1
2   56   21  789    0
3   89   25   12    1

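But I am not sure exactly how to do this. I think a left join is the way to go, but I could not get that to work either. Any help would be greatly appreciated.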

Best answer

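You can do a left join with join or merge, then test the FOO column with notnull. Rows without a match have NaN in FOO, so this gives a boolean mask, which astype(int) converts to 0 and 1: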

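# Left join on the (PID, UID) pair; FOO is NaN where df_b has no matching row.
# The parentheses let the method chain continue on the next line.
df_a['PUR'] = (df_a.join(df_b.set_index(['PID','UID']), on=['PID','UID'])['FOO']
                   .notnull().astype(int))
print(df_a)
   PID  TIM  UID  PUR
0   12   76  123    0
1   55   54  456    1
2   56   21  789    0
3   89   25   12    1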

df_a['PUR'] = pd.merge(df_a, df_b, how='left', on=['PID','UID'])['FOO'].notnull().astype(int)
print(df_a)
   PID  TIM  UID  PUR
0   12   76  123    0
1   55   54  456    1
2   56   21  789    0
3   89   25   12    1
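
The merge variant above relies on FOO never being NaN in df_b itself. As a rough sketch of one way around that (not part of the original answer), merge also accepts indicator=True, which adds a _merge column saying where each row came from. The helper name keys is only for illustration, and the frames are rebuilt here so the snippet runs on its own:

```python
import pandas as pd

df_a = pd.DataFrame({"PID": [12, 55, 56, 89],
                     "TIM": [76, 54, 21, 25],
                     "UID": [123, 456, 789, 12]})
df_b = pd.DataFrame({"FOO": [2347, 32447, 3234, 7999],
                     "PID": [17, 89, 51, 55],
                     "UID": [221, 12, 653, 456]})

# Keep only the key columns of df_b, one row per (PID, UID) pair,
# so the left merge returns exactly one row per row of df_a.
keys = df_b[["PID", "UID"]].drop_duplicates()

merged = df_a.merge(keys, how="left", on=["PID", "UID"], indicator=True)

# "_merge" is "both" where the pair exists in df_b and "left_only" otherwise.
df_a["PUR"] = (merged["_merge"] == "both").astype(int).to_numpy()
print(df_a)
```

Since the test is on _merge rather than on a payload column, it does not matter which other columns df_b carries or whether they contain missing values.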

Another solution is to test with isin:

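# Note: Series.isin compares values only and ignores the index,
# so this line checks UID membership and does not compare PID.
df_a['PUR'] = (df_a.set_index('PID')['UID'].isin(df_b.set_index('PID')['UID'])
                   .astype(int).values)
print(df_a)
   PID  TIM  UID  PUR
0   12   76  123    0
1   55   54  456    1
2   56   21  789    0
3   89   25   12    1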
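
If the test has to hold for the (PID, UID) pair and not for UID alone, one possible variant (a sketch, not from the original answer) is to build a MultiIndex from the two key columns and call isin on that. The names pairs_a, pairs_b and the table df_c are made up for illustration; df_c is a hypothetical case where the UID matches but the PID does not:

```python
import pandas as pd

df_a = pd.DataFrame({"PID": [12, 55, 56, 89],
                     "TIM": [76, 54, 21, 25],
                     "UID": [123, 456, 789, 12]})
df_b = pd.DataFrame({"FOO": [2347, 32447, 3234, 7999],
                     "PID": [17, 89, 51, 55],
                     "UID": [221, 12, 653, 456]})

# One tuple per row: (PID, UID).
pairs_a = pd.MultiIndex.from_frame(df_a[["PID", "UID"]])
pairs_b = pd.MultiIndex.from_frame(df_b[["PID", "UID"]])

# MultiIndex.isin returns a boolean array that is True where the whole pair matches.
df_a["PUR"] = pairs_a.isin(pairs_b).astype(int)
print(df_a)

# Hypothetical table (not in the question): UID 123 exists, but under another PID.
df_c = pd.DataFrame({"PID": [99], "UID": [123]})
uid_only = df_a["UID"].isin(df_c["UID"]).astype(int).tolist()
pair_wise = pairs_a.isin(pd.MultiIndex.from_frame(df_c)).astype(int).tolist()
print(uid_only)   # flags row 0 because the UID alone matches
print(pair_wise)  # flags nothing because no (PID, UID) pair matches
```

On the question's data both approaches give the same labels; they only differ when a UID shows up in the other table under a different PID.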

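Edit:

It seems you also need drop_duplicates on both key columns. If a (PID, UID) pair occurs more than once in df_b, the left join returns an extra row for each repeat, and the result no longer lines up with df_a: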

# df_b with a duplicated (PID, UID) pair added
df_b = pd.DataFrame({'FOO': [2347, 32447, 3234, 7999],
                     'PID': [17, 89, 55, 55],
                     'UID': [221, 12, 456, 456]})
print(df_b)
     FOO  PID  UID
0   2347   17  221
1  32447   89   12
2   3234   55  456  <- duplicated on both columns
3   7999   55  456  <- duplicated on both columns

# Keep the first row for each (PID, UID) pair, then join as before.
df_b = df_b.drop_duplicates(['PID','UID'])
df_a['PUR'] = (df_a.join(df_b.set_index(['PID','UID']), on=['PID','UID'])['FOO']
                   .notnull().astype(int))
print(df_a)
   PID  TIM  UID  PUR
0   12   76  123    0
1   55   54  456    1
2   56   21  789    0
3   89   25   12    1
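
To see why the duplicates matter, here is a small check (a sketch with illustrative variable names, not from the original answer) that compares the number of rows the left merge returns before and after drop_duplicates:

```python
import pandas as pd

df_a = pd.DataFrame({"PID": [12, 55, 56, 89],
                     "TIM": [76, 54, 21, 25],
                     "UID": [123, 456, 789, 12]})
# df_b with the pair (55, 456) appearing twice, as in the edit above.
df_b = pd.DataFrame({"FOO": [2347, 32447, 3234, 7999],
                     "PID": [17, 89, 55, 55],
                     "UID": [221, 12, 456, 456]})

# True if any (PID, UID) pair repeats in df_b.
has_dupes = bool(df_b.duplicated(["PID", "UID"]).any())

# Each repeated pair in df_b adds an extra row for the matching row of df_a.
n_raw = len(df_a.merge(df_b, how="left", on=["PID", "UID"]))

deduped = df_b.drop_duplicates(["PID", "UID"])
n_dedup = len(df_a.merge(deduped, how="left", on=["PID", "UID"]))

print(has_dupes, len(df_a), n_raw, n_dedup)
```

Only the deduplicated merge has the same number of rows as df_a, which is what the assignment to df_a['PUR'] needs.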

Regarding "python - How to create a binary label from two tables", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43132104/
