
python - Extract strings from a DataFrame by matching against a list

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:44:30

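I am trying to extract strings from a column of a pandas DataFrame; the source strings I have to match against are in a list. I tried df.str.extract(list1), but I get an unhashable type error. I guess the way I am comparing the list with the DataFrame is incorrect.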

From

Col 1   Col 2
1       The date
2       Three has come
3       Mail Sent
4       Done Deal

To

Col 1   Col 2            Col 3
1       The date         NaN
2       Three has come   Three has
3       Mail Sent        Mail
4       Done Deal        Done

My list is as follows:

List1 = ['Three has' , 'Mail' , 'Done' , 'Game' , 'Time has come']

Best answer

You can use extract with all the values of the list joined by |, which means "or" in regex:

List1 = ['Three has', 'Mail', 'Done', 'Game', 'Time has come']
# Build one capture group of alternatives: (Three has|Mail|Done|Game|Time has come)
df['Col 3'] = df['Col 2'].str.extract("(" + "|".join(List1) + ")", expand=False)
print(df)
   Col 1           Col 2      Col 3
0      1        The date        NaN
1      2  Three has come  Three has
2      3       Mail Sent       Mail
3      4       Done Deal       Done
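
The joined pattern treats every list entry as a regular expression, so an entry containing characters such as "." or "+" would not be matched literally. The sketch below is a self-contained variant of the same extract approach, assuming the entries are meant as literal text: it builds the sample frame from the question, escapes each entry with re.escape, and orders longer entries first so they are tried before shorter ones. The names alternatives and pattern are illustrative and not part of the original answer.

```python
import re

import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Col 1": [1, 2, 3, 4],
    "Col 2": ["The date", "Three has come", "Mail Sent", "Done Deal"],
})
List1 = ["Three has", "Mail", "Done", "Game", "Time has come"]

# Assumption: list entries are literal strings, not regex fragments.
# Escape them, and put longer entries first so they are tried before
# shorter ones at the same position in the text.
alternatives = sorted(List1, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(s) for s in alternatives) + ")"

# expand=False returns a Series (one capture group) instead of a DataFrame.
df["Col 3"] = df["Col 2"].str.extract(pattern, expand=False)
print(df)
```

Rows with no match keep a missing value in Col 3, as in the output above.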

Another solution:

List1 = ['Three has', 'Mail', 'Done', 'Game', 'Time has come']

# Plain substring test per row; if several entries match, they are concatenated
df['Col 3'] = df['Col 2'].apply(lambda x: ''.join([L for L in List1 if L in x]))
# Replace empty strings (no match) with NaN
df['Col 3'] = df['Col 3'].mask(df['Col 3'] == '')
print(df)
   Col 1           Col 2      Col 3
0      1        The date        NaN
1      2  Three has come  Three has
2      3       Mail Sent       Mail
3      4       Done Deal       Done
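
If a row can contain more than one list entry, the join above glues all of them together. A minimal sketch of a variant that keeps only the first matching entry, in list order, is shown here; first_match is a hypothetical helper introduced for illustration, not something from the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Col 1": [1, 2, 3, 4],
    "Col 2": ["The date", "Three has come", "Mail Sent", "Done Deal"],
})
List1 = ["Three has", "Mail", "Done", "Game", "Time has come"]


def first_match(text, candidates):
    """Return the first candidate that occurs in text, or None if none does."""
    return next((c for c in candidates if c in text), None)


# Plain substring test, no regex involved; unmatched rows get a missing value.
df["Col 3"] = df["Col 2"].apply(lambda text: first_match(text, List1))
print(df)
```

Because this uses the in operator rather than a regular expression, special characters in the list entries need no escaping.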

This post on python - Extract strings from a DataFrame by matching against a list is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/41954822/
