
python - Create a new column of 0 and 1 values based on regex results

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:37:03

My dataframe has the following values:

data_df

0 student
1 sample text
2 student
3 no students
4 sample texting
5 random sample

I used a regular expression to extract the rows containing the word "student", with this result:

regexdf
0 student
2 student

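My goal is to create a new column of 0 and 1 values in the main dataframe: row 0 should be 1 and row 5 should be 0, because "regexdf" contains "student" at rows 0 and 2. How can I match the indexes of the two and create the column?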

Best answer

Using a regular expression:

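# Extract "student" only when a word boundary follows, so "students" yields NaN
data_df = data_df.assign(regexdf=data_df[1].str.extract(r'(student)\b', expand=False))
# Non-null extractions become 1, everything else 0
data_df['student'] = data_df['regexdf'].notnull().astype(int)
print(data_df)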

Output:

                1  regexdf  student
0         student  student        1
1     sample text      NaN        0
2         student  student        1
3     no students      NaN        0
4  sample texting      NaN        0
5   random sample      NaN        0
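
For reference, here is a self-contained sketch of the same idea. The sample frame is rebuilt with a hypothetical column name `text` (the original uses a column labelled `1`), and `str.contains` is swapped in for `str.extract`, since only a yes/no flag is needed. A leading `\b` is also added as an assumption about the intent, so that only the whole word "student" counts.

```python
import pandas as pd

# Rebuild the sample data; the column name "text" is an assumption for illustration.
data_df = pd.DataFrame({
    "text": ["student", "sample text", "student",
             "no students", "sample texting", "random sample"]
})

# \b on both sides matches "student" as a whole word, so "students" is excluded.
mask = data_df["text"].str.contains(r"\bstudent\b", regex=True)

# Convert the boolean mask to 0/1.
data_df["student"] = mask.astype(int)
print(data_df)
```

This skips the intermediate extracted column entirely; keep the `str.extract` version if the matched text itself is needed later.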

Edit

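# Join on the index; the overlapping column from regexdf gets the suffix 'regex'
df_out = data_df.join(regexdf, rsuffix='regex')
# 1 where the joined column has a value (index present in regexdf), else 0
df_out['pattern'] = df_out['1regex'].notnull().astype(int)
# Running count of matches
df_out['Count_Pattern'] = df_out['pattern'].cumsum()
print(df_out)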

Output:

                1   1regex  pattern  Count_Pattern
0         student  student        1              1
1     sample text      NaN        0              1
2         student  student        1              2
3     no students      NaN        0              2
4  sample texting      NaN        0              2
5   random sample      NaN        0              2
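
The index matching the question asks about can also be sketched without a join, assuming `regexdf` keeps the original row labels of `data_df`. The column name `text` and the way `regexdf` is built below are hypothetical stand-ins for the frames in the question.

```python
import pandas as pd

# Rebuild the sample data; the column name "text" is an assumption for illustration.
data_df = pd.DataFrame({
    "text": ["student", "sample text", "student",
             "no students", "sample texting", "random sample"]
})

# Stand-in for the filtered frame from the question: rows 0 and 2, original labels kept.
regexdf = data_df.loc[[0, 2]]

# Flag each row of data_df whose index label also appears in regexdf.
data_df["pattern"] = data_df.index.isin(regexdf.index).astype(int)

# Running count of flagged rows, as in the joined version.
data_df["Count_Pattern"] = data_df["pattern"].cumsum()
print(data_df)
```

`Index.isin` compares labels only, so it gives the same flags as the join as long as the filtered frame was not re-indexed.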

Regarding "python - Create a new column of 0 and 1 values based on regex results", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47637494/
