
python - Removing columns from a DataFrame using hashing

Reposted. Author: 行者123. Updated: 2023-12-01 03:39:30

Given two pandas DataFrames:

df1 = pd.read_csv(file1, names=['col1','col2','col3'])
df2 = pd.read_csv(file2, names=['col1','col2','col3'])

I want to remove every row of df2 whose col1 value, col2 value, or both do not exist in df1.

Doing the following:

df2 = df2[(df2['col1'] in set(df1['col1'])) & (df2['col2'] in set(df1['col2']))]

yields:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

Best answer

I think you can try isin:

df2 = df2[(df2['col1'].isin(df1['col1'])) & (df2['col2'].isin(df1['col2']))]
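To make the difference concrete, here is a minimal sketch of why the `in` version fails while `isin` works. It assumes small stand-in frames in place of the CSV files, and the names `raised`, `mask1`, `mask2` and `kept` are only illustrative. The `in` operator asks the set to hash the whole Series as one object, whereas `isin` tests each element and returns a boolean Series that can be used as a row mask.

```python
import pandas as pd

# Small stand-in frames (assumed data, not read from the CSV files).
df1 = pd.DataFrame({'col1': [1, 2, 3, 3], 'col2': [4, 5, 6, 2]})
df2 = pd.DataFrame({'col1': [1, 2, 3, 5], 'col2': [4, 7, 4, 1]})

# `x in some_set` hashes x as a single object; a Series is unhashable,
# so the membership test raises TypeError before any comparison happens.
try:
    df2['col1'] in set(df1['col1'])
    raised = False
except TypeError:
    raised = True

# isin checks every element separately and returns a boolean Series.
mask1 = df2['col1'].isin(df1['col1'])
mask2 = df2['col2'].isin(df1['col2'])

# Combine the masks with & (element-wise AND) and keep the matching rows.
kept = df2[mask1 & mask2]
print(kept)
```

The parentheses around each condition matter in the one-line form, because `&` binds more tightly than comparison operators in Python.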

df1 = pd.DataFrame({'col1': [1, 2, 3, 3],
                    'col2': [4, 5, 6, 2],
                    'col3': [7, 8, 9, 5]})

print(df1)
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
3     3     2     5

df2 = pd.DataFrame({'col1': [1, 2, 3, 5],
                    'col2': [4, 7, 4, 1],
                    'col3': [7, 8, 9, 1]})

print(df2)
   col1  col2  col3
0     1     4     7
1     2     7     8
2     3     4     9
3     5     1     1

df2 = df2[(df2['col1'].isin(df1['col1'])) & (df2['col2'].isin(df1['col2'].unique()))]
print(df2)
   col1  col2  col3
0     1     4     7
2     3     4     9

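Another solution is merge, since an inner join (how='inner') is the default. Note that with no on argument it joins on every column the two DataFrames share (here col1, col2 and col3), so it keeps only rows whose values agree in all of those columns: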

print(pd.merge(df1, df2))
   col1  col2  col3
0     1     4     7
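As a rough illustration of how merge decides which rows match, the sketch below rebuilds the two example frames (using the unfiltered df2) and compares a few variants. The names `default_join`, `explicit_join`, `reordered_join`, `pair_keys` and `pair_join` are hypothetical, and the pair-only join is just one possible reading of the question, not necessarily what the asker wanted.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 3],
                    'col2': [4, 5, 6, 2],
                    'col3': [7, 8, 9, 5]})
df2 = pd.DataFrame({'col1': [1, 2, 3, 5],
                    'col2': [4, 7, 4, 1],
                    'col3': [7, 8, 9, 1]})

# With no `on`, merge joins on every column name the two frames share.
default_join = pd.merge(df1, df2)
explicit_join = pd.merge(df1, df2, on=['col1', 'col2', 'col3'], how='inner')

# Reversing the row order of df2 does not change which rows match:
# the join compares key values, not index positions.
reordered_join = pd.merge(df1, df2.iloc[::-1].reset_index(drop=True))

# Joining on the (col1, col2) pair only keeps rows of df2 whose pair
# occurs together in some row of df1. This is stricter than two
# independent isin checks.
pair_keys = df1[['col1', 'col2']].drop_duplicates()
pair_join = pd.merge(df2, pair_keys, on=['col1', 'col2'], how='inner')

print(default_join)
print(pair_join)
```

Dropping duplicate key pairs before the join keeps a repeated pair in df1 from multiplying the matching rows of df2.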

For more on python - Removing columns from a DataFrame using hashing, see the similar question on Stack Overflow: https://stackoverflow.com/questions/39871559/
