
python - How to create new columns based on the presence of a phrase?

Reposted. Author: 行者123. Updated: 2023-11-28 21:37:51

I want to create new columns based on whether certain phrases are present.

Here is my data:

No   Body
1    Office software is already paid
2    Excel software is not paid yet
3    Power point software is already paid

I want to classify each row by whether a given phrase appears in it. This is my code:

countries1 = df.Body.str.extract('(software|is already paid)', expand = False)
dummies1 = pd.get_dummies(countries1)
df_1 = pd.concat([df,dummies1],axis = 1)

The result is:

No   Body                                   software   is already paid
1    Office software is already paid        0          1
2    Excel software is not paid yet         1          0
3    Power point software is already paid   0          1

What I expect is:

No   Body                                   software   is already paid
1    Office software is already paid        1          1
2    Excel software is not paid yet         1          0
3    Power point software is already paid   1          1

What is wrong with my code? Or am I not using the right function?

Best answer

Let's try using extractall:

# sum(level=0) was removed in pandas 2.0; group on the original row index instead
df.assign(**df.Body.str.extractall('(software|is already paid)')[0]
            .str.get_dummies().groupby(level=0).sum())

Output:

   No                                  Body  is already paid  software
0   1       Office software is already paid                1         1
1   2        Excel software is not paid yet                0         1
2   3  Power point software is already paid                1         1
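
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and contrasts str.extract, which keeps only the first match in each row, with str.extractall, which returns every match. The names first_match, all_matches, dummies and result are illustrative and not from the original post. Using max() instead of sum(), and the reindex step for rows with no match, are assumptions about what is wanted (0/1 indicators rather than occurrence counts).

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "No": [1, 2, 3],
    "Body": [
        "Office software is already paid",
        "Excel software is not paid yet",
        "Power point software is already paid",
    ],
})

pattern = r"(software|is already paid)"

# str.extract keeps only the first match found in each row
first_match = df["Body"].str.extract(pattern, expand=False)

# str.extractall returns one row per match, indexed by (original row, match number)
all_matches = df["Body"].str.extractall(pattern)[0]

# One indicator column per phrase, collapsed back to one row per original row
dummies = (
    pd.get_dummies(all_matches, dtype=int)
    .groupby(level=0)
    .max()                            # 1 if the phrase occurs at least once
    .reindex(df.index, fill_value=0)  # rows with no match at all become all zeros
)

result = df.join(dummies)
print(result)
```

With sum() in place of max(), each column would hold the number of times the phrase occurs in the row, which gives the same values for this data because no phrase repeats within a row.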

Regarding "python - How to create new columns based on the presence of a phrase?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48859714/
