
python - Why am I getting a "List index out of range" error?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:49:04

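So I have a list of files that I want to read through and print information from. It keeps giving me the error "list index out of range", and I can't tell what is going wrong. For line 2, if I add matches[:10], it may work for the first 10 files, but I need it to handle all of the files. I checked some older posts but still can't get my code to work.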

re.findall worked earlier, when I was writing this code in pieces. I'm not sure why it no longer works. Thanks.

import re, os
topdir = r'E:\Grad\LIS\LIS590 Text mining\Part1\Part1' # Topdir has to be an object rather than a string, which means that there are no parentheses.
matches = []
for root, dirnames, filenames in os.walk(topdir):
    for filename in filenames:
        if filename.endswith(('.txt','.pdf')):
            matches.append(os.path.join(root, filename))

capturedorgs = []
capturedfiles = []
capturedabstracts = []
orgAwards={}
for filepath in matches:
    with open (filepath,'rt') as mytext:
        mytext=mytext.read()

    matchOrg=re.findall(r'NSF\s+Org\s+\:\s+(\w+)',mytext)[0]
    capturedorgs.append(matchOrg)

    # code to capture files
    matchFile=re.findall(r'File\s+\:\s+(\w\d{7})',mytext)[0]
    capturedfiles.append(matchFile)

    # code to capture abstracts
    matchAbs=re.findall(r'Abstract\s+\:\s+(\w.+)',mytext)[0]
    capturedabstracts.append(matchAbs)

    # total awarded money
    matchAmt=re.findall(r'Total\s+Amt\.\s+\:\s+\$(\d+)',mytext)[0]

    if matchOrg not in orgAwards:
        orgAwards[matchOrg]=[]
    orgAwards[matchOrg].append(int(matchAmt))

for each in capturedorgs:
    print(each,"\n")
for each in capturedfiles:
    print(each,"\n")
for each in capturedabstracts:
    print (each,"\n")

# add code to print what is in your other two lists
from collections import Counter
countOrg=Counter(capturedorgs)
print (countOrg)

for each in orgAwards:
    print(each,sum(orgAwards[each]))

Error message:

Traceback (most recent call last):
  File "C:\Python32\Assignment1.py", line 17, in <module>
    matchOrg=re.findall(r'NSF\s+Org\s+\:\s+(\w+)',mytext)[0]
IndexError: list index out of range

Best answer

If findall finds no match, it returns an empty list []. The error occurs when you try to take the first item of that empty list, which raises the exception:

>>> import re
>>> i = 'hello'
>>> re.findall('abc', i)
[]
>>> re.findall('abc', i)[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
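
The same situation can also be detected without triggering the exception, by checking whether the list is empty before indexing it. The following is only a minimal sketch: the helper name first_match and the sample strings are made up for illustration, and the pattern is the one from the question.

```python
import re

# Pattern taken from the question; the sample strings below are invented.
ORG_PATTERN = r'NSF\s+Org\s+\:\s+(\w+)'

def first_match(pattern, text):
    """Return the first findall result, or None if the list is empty."""
    results = re.findall(pattern, text)
    if not results:        # empty list: there is no item at index 0
        return None
    return results[0]

print(first_match(ORG_PATTERN, "NSF Org     : DMS"))   # DMS
print(first_match(ORG_PATTERN, "hello"))               # None
```

A None result then marks a file in which the pattern was not found.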

To make sure your code does not stop when no match is found, you need to catch the exception that is raised:

try:
    matchOrg=re.findall(r'NSF\s+Org\s+\:\s+(\w+)',mytext)[0]
    capturedorgs.append(matchOrg)
except IndexError:
    print('No organization match for {}'.format(filepath))
You have to do this for each re.findall statement.
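
One way to avoid writing the same try/except four times is to loop over the patterns. This is a sketch under assumptions, not the answerer's code: the PATTERNS dictionary, its key names, and the extract_fields helper are hypothetical, while the four regular expressions are copied from the question.

```python
import re

# Regular expressions copied from the question; the key names are invented.
PATTERNS = {
    "org": r'NSF\s+Org\s+\:\s+(\w+)',
    "file": r'File\s+\:\s+(\w\d{7})',
    "abstract": r'Abstract\s+\:\s+(\w.+)',
    "amount": r'Total\s+Amt\.\s+\:\s+\$(\d+)',
}

def extract_fields(text):
    """Return (fields, missing).

    fields maps each pattern name to its first match; missing lists the
    names of the patterns that matched nothing in the text.
    """
    fields, missing = {}, []
    for name, pattern in PATTERNS.items():
        try:
            fields[name] = re.findall(pattern, text)[0]
        except IndexError:   # findall returned an empty list
            missing.append(name)
    return fields, missing
```

Inside the question's loop, a call such as fields, missing = extract_fields(mytext) would let a file with a non-empty missing list be reported and skipped. That also keeps orgAwards from being updated with a matchOrg value left over from an earlier file.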

Regarding python - Why am I getting a "List index out of range" error?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22251945/
