
python - What is the best way to combine 2 string columns in pandas into a new column based on a specific condition?

Reposted. Author: 行者123. Updated: 2023-12-01 07:44:54

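I have a pandas DataFrame with string values in every column. I want to combine column 1 and column 2 into a new column, say column 4. However, if the words in column 1 and column 2 are the same, I want to combine column 1 and column 3 into the new column instead.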

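I tried putting the pairs into a list first and then adding the list as a separate column, but that did not work. I am new to Python, so I think I am missing a simpler solution.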

pairs = []
for row in df['interest1']:
    if row == df['interest2'].iloc[row]:
        pairs.append(df['interest1'] + ' ' + df['interest2'])
    else:
        pairs.append(df['interest1'] + ' ' + df['interest3'])

# A simple example of what I would like to achieve

import pandas as pd

lst= [['music','music','film','music film'],
['guitar','piano','violin','guitar piano'],
['music','photography','photography','music photography'],
]

df= pd.DataFrame(lst,columns=['interest1','interest2','interest3','first distinct pair'])
df

Best answer

You can use the pandas where method (called here on the combined Series):

df['first_distinct_pair'] = (df['interest1'] + df['interest2']).where(df['interest1'] != df['interest2'],  df['interest1'] + df['interest3'])
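
The condition is written with != because Series.where keeps the caller's value wherever the condition is True and takes the value from the second argument wherever it is False. A minimal sketch of that behaviour, using made-up toy data (the names left, right and keep are hypothetical and do not appear in the question):

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
left = pd.Series(['a', 'b', 'c'])
right = pd.Series(['x', 'y', 'z'])
keep = pd.Series([True, False, True])

# Where `keep` is True the value from `left` is kept;
# where it is False the value from `right` is used instead.
result = left.where(keep, right)
print(result.tolist())  # ['a', 'y', 'c']
```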

If you want to include a space between the two words, you can do this:

df['first_distinct_pair'] = (df['interest1'] + ' '+ df['interest2']).where(df['interest1'] != df['interest2'],  df['interest1'] + ' ' + df['interest3'])

The result looks like this:

>>> import pandas as pd
>>> lst = [['music', 'music', 'film'],
...        ['guitar', 'piano', 'violin'],
...        ['music', 'photography', 'photography']]
>>> df = pd.DataFrame(lst, columns=['interest1', 'interest2', 'interest3'])

>>> df['first_distinct_pair'] = (df['interest1'] + ' '+ df['interest2']).where(df['interest1'] != df['interest2'], df['interest1'] + ' ' + df['interest3'])

>>> df
  interest1    interest2    interest3 first_distinct_pair
0     music        music         film          music film
1    guitar        piano       violin        guitar piano
2     music  photography  photography   music photography
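
For comparison, here is a sketch of two other ways to build the same column. Neither is part of the original answer. The first uses numpy.where, which takes the condition first and picks the second argument where the condition is True. The second is a row-wise loop close to what the question attempted. The loop in the question fails because iterating over df['interest1'] yields strings, which cannot be used as positions with .iloc, and because each append adds a whole column rather than a single value. The column names pair_np and pair_loop are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['music', 'music', 'film'],
     ['guitar', 'piano', 'violin'],
     ['music', 'photography', 'photography']],
    columns=['interest1', 'interest2', 'interest3'],
)

# numpy.where(condition, value_if_true, value_if_false)
same = df['interest1'] == df['interest2']
df['pair_np'] = np.where(
    same,
    df['interest1'] + ' ' + df['interest3'],  # columns 1 and 2 match
    df['interest1'] + ' ' + df['interest2'],  # columns 1 and 2 differ
)

# Row-wise version: walk the three columns in parallel so that
# a, b and c are the plain strings from a single row.
pairs = []
for a, b, c in zip(df['interest1'], df['interest2'], df['interest3']):
    pairs.append(f'{a} {c}' if a == b else f'{a} {b}')
df['pair_loop'] = pairs

print(df[['pair_np', 'pair_loop']])
```

On large frames the vectorised forms (where or numpy.where) are usually preferable to the loop, which is shown mainly because it is easier to follow step by step.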

Regarding "python - What is the best way to combine 2 string columns in pandas into a new column based on a specific condition?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56510019/
