python - Generating a text file from loop results

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:59:38

I have a text file containing 32 articles. I managed to find each article with the following code:

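import re

sections = []
current = []
with open("Aberdeen2005.txt") as f:
    for line in f:
        # A header such as "1 of 32 DOCUMENTS" starts a new article
        if re.search(r"(?i)\d+ of \d+ DOCUMENTS", line):
            if current:
                sections.append("".join(current))
            current = [line]
        else:
            current.append(line)

# Flush the last article, which has no header after it
if current:
    sections.append("".join(current))

print(len(sections))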

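Next, I checked how many articles contain the keywords I am interested in: tax and policy. In the same pass, if an article contains one of them, I extract its month: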

months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

for i in range(len(sections)):
    if (' tax ' in sections[i]
            or ' Tax ' in sections[i]
            or ' policy ' in sections[i]
            or ' Policy ' in sections[i]):
        # Look for a month name in the first six lines of the article
        pat = re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)
        month = pat.search("\n".join(sections[i].splitlines()[0:6]))
        print(month)
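
The month lookup above can also be pulled into a small helper. This is only a sketch under assumptions: `MONTHS`, `MONTH_PATTERN`, `find_month`, and the sample header text are hypothetical names and data, not part of the original post. It compiles the pattern once and returns `None` when no month appears in the first few lines.

```python
import re

# Hypothetical names for illustration; the month list matches the post.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Compile once instead of on every loop iteration.
MONTH_PATTERN = re.compile("|".join(r"\b{}\b".format(m) for m in MONTHS))


def find_month(article, header_lines=6):
    """Return the first month name in the first few lines, or None."""
    head = "\n".join(article.splitlines()[:header_lines])
    match = MONTH_PATTERN.search(head)
    return match.group(0) if match else None


# Made-up article headers, only to exercise the helper.
print(find_month("1 of 32 DOCUMENTS\nThe Herald\nNovember 3, 2005\nBody text"))
print(find_month("No date in this header"))
```

Returning the string (or `None`) rather than the match object means the caller never has to call `.group(0)` on a failed search.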

Last but not least, I want to create a text file containing the months found above:

outfile = open('C:/Users/nn/Desktop/Uncertainty_Scot/dates.txt', 'w')
outfile.write(month.group(0))
outfile.close()

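This is where the problem is: the file only contains the last month. I guess that is because the write is not inside the loop. Any ideas on how to do this?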

Kind regards!

Best answer

You just need to wrap the loop in a with block that opens the output file, like this:

months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

with open(r'C:\Users\nn\Desktop\Uncertainty_Scot\dates.txt', 'w') as outfile:
    for i in range(len(sections)):
        if (' tax ' in sections[i] or ' Tax ' in sections[i] or ' policy ' in sections[i] or ' Policy ' in sections[i]):
            pat = re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)
            month = pat.search("\n".join(sections[i].splitlines()[0:6]))
            print(month)
            # Skip articles with no month in the header; write one month per line
            if month:
                outfile.write(month.group(0) + "\n")
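
To see why the placement of the write matters, here is a small self-contained sketch. It uses `io.StringIO` in place of a real file, and the list `found` is made-up data, so both are assumptions for illustration only. After a `for` loop ends, the loop variable still holds just the last value, so a single write placed after the loop can only ever record that one.

```python
import io

found = ["January", "March", "July"]  # made-up months, for illustration

# Write placed after the loop: `month` only holds the last item by then.
buf_after = io.StringIO()
for month in found:
    pass
buf_after.write(month + "\n")

# Write placed inside the loop: every item is recorded.
buf_inside = io.StringIO()
for month in found:
    buf_inside.write(month + "\n")

print(repr(buf_after.getvalue()))   # 'July\n'
print(repr(buf_inside.getvalue()))  # 'January\nMarch\nJuly\n'
```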

You can improve the loop further as follows:

months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

with open('C:/Users/nn/Desktop/Uncertainty_Scot/dates.txt', 'w') as outfile:
    for s in sections:
        if any(x in s.lower() for x in [' tax ', ' policy ']):
            pat = re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)
            month = pat.search("\n".join(s.splitlines()[0:6]))
            print(month)
            # Skip articles with no month in the header; write one month per line
            if month:
                outfile.write(month.group(0) + "\n")

By converting to lowercase first, you only need to test one version of each string, and it will also catch entries of the form " TAX ".
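
If the space-delimited test turns out to be too strict (it misses "tax," or "Policy." next to punctuation, for example), one possible alternative is a case-insensitive regular expression with word boundaries. This is a sketch rather than part of the original answer; `KEYWORDS` and `mentions_keywords` are hypothetical names.

```python
import re

# Hypothetical helper: whole-word, case-insensitive keyword test.
KEYWORDS = re.compile(r"\b(?:tax|policy)\b", re.IGNORECASE)


def mentions_keywords(text):
    """Return True if 'tax' or 'policy' appears as a whole word."""
    return KEYWORDS.search(text) is not None


print(mentions_keywords("New TAX, same problems."))      # True
print(mentions_keywords("A change of Policy."))          # True
print(mentions_keywords("A taxi ride across Aberdeen"))  # False
```

Note that whole-word matching does not catch plurals such as "taxes" or "policies"; those would have to be added to the pattern if they matter.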

This post on python - generating a text file from loop results is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/35101222/