
Python: when reading a file and counting the words in a line, I want to count the words between " " or ' ' as a single word

Reposted. Author: 行者123. Updated: 2023-12-05 02:35:50

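I have a file in which I have to count the number of words in each line, but there is a catch: whatever appears between ' ' or " " should be counted as a single word.

Example file: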

TopLevel  
DISPLAY "In TopLevel. Starting to run program"
PERFORM OneLevelDown
DISPLAY "Back in TopLevel."
STOP RUN.

For the file above, the word count for each line must be as follows:

Line: 1 has: 1 words  
Line: 2 has: 2 words
Line: 3 has: 2 words
Line: 4 has: 2 words
Line: 5 has: 2 words

But I get the following:

Line: 1 has: 1 words  
Line: 2 has: 7 words
Line: 3 has: 2 words
Line: 4 has: 4 words
Line: 5 has: 2 words

Here is my code:

from os import listdir
from os.path import isfile, join

srch_dir = r'C:\Users\sagrawal\Desktop\File'

onlyfiles = [srch_dir+'\\'+f for f in listdir(srch_dir) if isfile(join(srch_dir, f))]

for i in onlyfiles:
    index = 0
    with open(i,mode='r') as file:
        lst = file.readlines()
        for line in lst:
            cnt = 0
            index += 1
            linewrds=line.split()
            for lwrd in linewrds:
                if lwrd:
                    cnt = cnt +1
            print('Line:',index,'has:',cnt,' words')

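Best answer

If you only have this simple format (no nested or escaped quotes), you can use a simple regular expression that matches either a whole quoted string or a run of non-whitespace characters: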

lines = '''TopLevel  
DISPLAY "In TopLevel. Starting to run program"
PERFORM OneLevelDown
DISPLAY "Back in TopLevel."
STOP RUN.'''.split('\n')

import re
counts = [len(re.findall(r'\'.*?\'|".*?"|\S+', l))
          for l in lines]
# [1, 2, 2, 2, 2]
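
To plug this into the per-file loop from the question, the regex can be wrapped in a small helper. This is only a minimal sketch: the names `TOKEN_RE`, `count_words` and `report` are illustrative and do not come from the original answer, and it carries the same assumption of no nested or escaped quotes.

```python
import re

# A quoted run ('...' or "...") is one token; otherwise a token is any
# run of non-whitespace characters. Same pattern as in the answer above.
TOKEN_RE = re.compile(r'\'.*?\'|".*?"|\S+')


def count_words(line):
    """Return the number of tokens in a line, counting quoted text as one word."""
    return len(TOKEN_RE.findall(line))


def report(path):
    """Print the word count for each line of the file at `path` (hypothetical helper)."""
    with open(path, mode='r') as file:
        for index, line in enumerate(file, start=1):
            print('Line:', index, 'has:', count_words(line), 'words')
```

Calling `report(i)` for each path in `onlyfiles` would replace the inner loop of the question's code.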

If not, you will have to write a parser.
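
As a rough idea of what such a parser could look like, here is a small character-by-character scanner. It is a sketch under assumptions the original answer does not state: quotes are escaped with a backslash, a quote attached to a word (as in `a"b c"`) stays part of that word, and an unterminated quote swallows the rest of the line. The name `count_words_parsed` is hypothetical.

```python
def count_words_parsed(line):
    """Count words, treating quoted text as one word and honoring backslash escapes."""
    count = 0
    quote = None      # the quote character we are currently inside, if any
    in_word = False   # True while scanning a token
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == '\\' and i + 1 < len(line):
                i += 1            # skip the escaped character
            elif ch == quote:
                quote = None      # closing quote; the token runs on until whitespace
        elif ch in '\'"':
            quote = ch            # opening quote starts or extends a token
            if not in_word:
                in_word = True
                count += 1
        elif ch.isspace():
            in_word = False       # whitespace outside quotes ends the token
        elif not in_word:
            in_word = True        # first character of an unquoted token
            count += 1
        i += 1
    return count
```

The standard library's `shlex.split` applies similar shell-style rules and may be enough on its own, although it raises `ValueError` when a quote is never closed.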

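Regarding "Python: when reading a file and counting the words in a line, I want to count the words between " " or ' ' as a single word", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/70444937/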
