python - Pandas - compare column IDs across rows and conditionally delete

Reposted. Author: 行者123. Updated: 2023-12-01 00:07:42

Given an example dataframe such as:

Qid   Sid   L1  L2
id01  id02  74  72
id01  id03  74  68
id02  id01  72  74
id02  id03  72  68

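I want to remove reciprocal hits, that is, rows whose (Qid, Sid) pair already appears in the opposite order, so the output should be:

Qid   Sid   L1  L2
id01  id02  74  72
id01  id03  74  68
id02  id03  72  68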

My real dataset has thousands of rows; the example above is only meant to illustrate the idea.

Best answer

Here is another idea:

import pandas as pd

# Sample data from the question
data = {'Qid': ['id01', 'id01', 'id02', 'id02'], 'Sid': ['id02', 'id03', 'id01', 'id03'], 'L1': [74, 74, 72, 72], 'L2': [72, 68, 74, 68]}
df = pd.DataFrame(data)
# Cast the scores to strings so they can be sorted together with the IDs (L1 and L2 remain strings afterwards).
df[['L1', 'L2']] = df[['L1', 'L2']].astype(str)
df['aux'] = df[['Qid', 'Sid', 'L1', 'L2']].values.tolist()  # One list of the 4 column values per row.
df['aux'] = df['aux'].apply(sorted).astype(str)  # Sort each list and use its string form as an order-independent key.
df = df.drop_duplicates(subset='aux').drop(columns='aux')  # Keep the first row per key, i.e. per combination of Qid, Sid, L1 and L2.
print(df)

Output:

    Qid   Sid  L1  L2
0  id01  id02  74  72
1  id01  id03  74  68
3  id02  id03  72  68
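
For a frame with thousands of rows, a variant of the same idea may be worth trying: sort only the two ID columns within each row and drop duplicates on that normalized pair, which avoids building string keys and leaves L1 and L2 as integers. This is a minimal sketch, not part of the original answer. It assumes a reciprocal hit is defined by the (Qid, Sid) pair alone, so the scores are not compared, and the helper name drop_reciprocal is hypothetical.

```python
import numpy as np
import pandas as pd


def drop_reciprocal(df, a="Qid", b="Sid"):
    """Keep the first row of each unordered (a, b) pair (hypothetical helper)."""
    # Sort the two ID columns within each row so (x, y) and (y, x) become identical.
    pair = np.sort(df[[a, b]].to_numpy(), axis=1)
    # Flag later rows whose normalized pair has already been seen.
    dup = pd.DataFrame(pair, index=df.index).duplicated()
    return df[~dup]


# Sample data from the question
df = pd.DataFrame({
    "Qid": ["id01", "id01", "id02", "id02"],
    "Sid": ["id02", "id03", "id01", "id03"],
    "L1": [74, 74, 72, 72],
    "L2": [72, 68, 74, 68],
})
result = drop_reciprocal(df)
print(result)
```

If reciprocal rows should only count as duplicates when the scores also mirror each other, the sorted-list key from the answer above is the closer fit.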

Regarding python - Pandas - compare column IDs across rows and conditionally delete, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59846844/
