
python - How to find sentences in a DataFrame that contain a single-character word anywhere

Reposted. Author: 行者123. Updated: 2023-11-30 21:51:34

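I am trying to print the sentences in a DataFrame that contain a one-character word, whether it appears at the beginning, in the middle, or at the end of the sentence. The code I tried is: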

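import re
import pandas as pd

# Pattern from the question: a character class with no word boundaries
lookfor = '[' + re.escape("A-Za-z") + ']'

# fileinput is the path to the input file (defined elsewhere).
# The first row is skipped when it contains no spaces (presumably a header).
tdata = pd.read_csv(fileinput, nrows=0).columns[0]
skip = int(tdata.count(' ') == 0)
tdata = pd.read_csv(fileinput, names=['sentences'], skiprows=skip)

filtered = tdata[tdata.sentences.str.contains(lookfor, regex=True, na=False)]
print(filtered)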

#a sample set
-----------------------------

#hi, how are; you z
#im w good thanks
#How am I
#good, what about you
#my name is alex
#K hello, alex how are you !
#it is a car
#great news
#thanks!
-----------------------------

expected output

-----------------------------
#hi, how are; you z
#im w good thanks
#How am I
#K hello, alex how are you !
#it is a car
-----------------------------

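Even when I list every letter in the lookfor pattern, it does not work: it prints every sentence that contains any of those letters, not only the sentences where a letter stands alone. Any ideas?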

Best answer

Use Series.str.contains with a pattern that matches a single word character between word boundaries, then filter with boolean indexing:

df = df[df['sentences'].str.contains(r'\b\w{1}\b')]
print (df)
sentences
0 hi, how are; you z
1 im w good thanks
2 How am I
5 K hello, alex how are you !
6 it is a car
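
To make the difference concrete, here is a minimal sketch that rebuilds the sample data from the question in memory (the DataFrame construction is an assumption; the question reads it from a file) and compares the question's pattern with the word-boundary pattern. It uses `\b\w\b`, which is equivalent to `\b\w{1}\b`.

```python
import re
import pandas as pd

# Sample sentences from the question, built in memory for illustration.
df = pd.DataFrame({"sentences": [
    "hi, how are; you z",
    "im w good thanks",
    "How am I",
    "good, what about you",
    "my name is alex",
    "K hello, alex how are you !",
    "it is a car",
    "great news",
    "thanks!",
]})

# The question's pattern: a bare character class matches a listed character
# anywhere, including inside longer words. re.escape also escapes the
# hyphens, so the class holds only the literal characters A, -, Z, a and z.
lookfor = "[" + re.escape("A-Za-z") + "]"
loose = df[df["sentences"].str.contains(lookfor, regex=True, na=False)]

# A word boundary on each side restricts the match to one-character words.
single = df[df["sentences"].str.contains(r"\b\w\b", regex=True, na=False)]

print(loose.index.tolist())   # every row: each sentence contains an "a"
print(single.index.tolist())  # [0, 1, 2, 5, 6]
```

Note that `\w` also matches digits and underscores; if only letters should count, `r'\b[A-Za-z]\b'` is a possible substitute. Because an apostrophe is a non-word character, the "s" in "it's" would also count as a one-character word with either pattern.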

Edit: To exclude the standalone words A and I, you can remove them with replace before the check:

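# regex=True is required in pandas 2.0+, where str.replace treats the pattern as a literal string by default
df = df[df['sentences'].str.replace(r'\b[AI]\b', '', regex=True).str.contains(r'\b\w{1}\b')]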
print (df)
sentences
0 hi, how are; you z
1 im w good thanks
5 K hello, alex how are you !
6 it is a car

Or:

df = df[~df['sentences'].str.contains(r'\b[AI]\b') &
        df['sentences'].str.contains(r'\b\w{1}\b')]
print (df)
sentences
0 hi, how are; you z
1 im w good thanks
5 K hello, alex how are you !
6 it is a car
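
The two exclusion variants give the same result on the sample data, but they are not equivalent in general. The sketch below uses three hypothetical sentences (not from the question) to show where they diverge: a sentence with both a standalone "I" and another one-character word is kept by the replace variant and dropped by the negation variant.

```python
import pandas as pd

# Hypothetical sentences chosen to expose the difference.
df = pd.DataFrame({"sentences": ["How am I", "I saw a car", "it is a car"]})
s = df["sentences"]

# Variant 1: strip standalone A/I first, then look for any remaining
# one-character word.
keep_replace = (
    s.str.replace(r"\b[AI]\b", "", regex=True).str.contains(r"\b\w\b")
)

# Variant 2: reject every sentence that contains a standalone A or I,
# even if it also has another one-character word.
keep_negate = ~s.str.contains(r"\b[AI]\b") & s.str.contains(r"\b\w\b")

print(keep_replace.tolist())  # [False, True, True]
print(keep_negate.tolist())   # [False, False, True]
```

Which variant fits depends on whether "I saw a car" should count as a match because of the lowercase "a".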

Regarding "python - How to find sentences in a DataFrame that contain a single-character word anywhere", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60090107/
