
python - Remove duplicate words from a column

Repost. Author: 行者123. Updated: 2023-12-05 04:22:38

I have a dataframe like this:

import pandas as pd

df3 = pd.DataFrame({
    'ID': ['Stay home, T5006, T5006, Stay home', 'Go for walk, T5007, T5007, Go for walk'],
    'Name': ['Stay home, Go for walk, Stay home', 'Go outside, Go outside, Go outside']
})


                                       ID                                Name
0      Stay home, T5006, T5006, Stay home   Stay home, Go for walk, Stay home
1  Go for walk, T5007, T5007, Go for walk  Go outside, Go outside, Go outside

I want to remove the duplicates from the ID column. Expected result:

                  ID                                Name
0    Stay home,T5006   Stay home, Go for walk, Stay home
1  Go for walk,T5007  Go outside, Go outside, Go outside

Any ideas?

Best answer

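Use the dict.fromkeys trick to remove duplicates from the split values while keeping their original order, then join them back with ', ' inside a lambda function: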

df3['ID'] = df3['ID'].apply(lambda x: ', '.join(dict.fromkeys(x.split(', '))))

Or use a list comprehension:

df3['ID'] = [', '.join(dict.fromkeys(x.split(', '))) for x in df3['ID']]

print(df3)
                   ID                                Name
0    Stay home, T5006   Stay home, Go for walk, Stay home
1  Go for walk, T5007  Go outside, Go outside, Go outside
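
To see why the order is preserved, here is a minimal sketch in plain Python, assuming Python 3.7+ (where dict keeps insertion order). The helper name dedupe_keep_order is hypothetical, and stripping whitespace around each piece is an added assumption so that inconsistently spaced input such as 'a,b , a' is handled too:

```python
def dedupe_keep_order(text, sep=","):
    """Hypothetical helper: drop repeated items, keeping first occurrences in order."""
    # Split on the bare separator and strip each piece (assumption: surrounding
    # whitespace is not significant).
    parts = [part.strip() for part in text.split(sep)]
    # dict.fromkeys builds a dict whose keys are the unique items, in the order
    # they were first seen; joining the dict iterates over those keys.
    return ", ".join(dict.fromkeys(parts))


print(dedupe_keep_order("Stay home, T5006, T5006, Stay home"))
print(dedupe_keep_order("a,b , a"))
```

Under the same assumptions it could be applied to the column with df3['ID'] = df3['ID'].map(dedupe_keep_order).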

If the order of the values does not matter, use set:

# with apply
df3['ID'] = df3['ID'].apply(lambda x: ', '.join(set(x.split(', '))))
# or, equivalently, with a list comprehension
df3['ID'] = [', '.join(set(x.split(', '))) for x in df3['ID']]
print(df3)
                   ID                                Name
0    Stay home, T5006   Stay home, Go for walk, Stay home
1  T5007, Go for walk  Go outside, Go outside, Go outside
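
Note that a set of strings has no guaranteed iteration order, so the order inside each ID can differ from the output shown and from one run to the next. If the original order is not needed but a reproducible result is, one option (not part of the original answer) is to sort the unique values. A minimal sketch, where dedupe_sorted is a hypothetical helper name:

```python
def dedupe_sorted(text, sep=", "):
    """Hypothetical helper: drop repeated items and return them in sorted order."""
    # set() removes the duplicates; sorted() makes the result deterministic.
    return sep.join(sorted(set(text.split(sep))))


print(dedupe_sorted("Go for walk, T5007, T5007, Go for walk"))
print(dedupe_sorted("Stay home, T5006, T5006, Stay home"))
```

As with the other variants, this could be applied with df3['ID'] = df3['ID'].map(dedupe_sorted).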

Regarding "python - Remove duplicate words from a column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73945687/
