
python - Web scraping: XPath list index out of range

Reposted. Author: 行者123. Updated: 2023-12-01 01:28:21

When I run the following code, I get a "list index out of range" error:

import requests
from lxml.html import fromstring

def get_values():
    print('executing get_values...')
    url = 'https://sports.yahoo.com/nba/stats/weekly/?sortStatId=POINTS_PER_GAME&selectedTable=0'
    response = requests.get(url)
    parser = fromstring(response.text)
    for i in parser.xpath('//tbody/tr')[:100]:
        FGM = i.xpath('.//td[4]/span/text()')[0]  # This runs with no error even though it has a similar XPath.
        print('FGM: ' + FGM)
        G = i.xpath('.//td[2]/span/text()')[0]
        print(G)

values = get_values()

Running the code produces the following error message:

 G=i.xpath('/./td[2]/span/text()')[0]
IndexError: list index out of range

I tried debugging with the following statements:

print(parser.xpath('//tbody/tr/td[2]/span/text()')) #Returns list['4', '4', '3', '3', '3', '4', '4', '3', '2', '4', '3']
print(parser.xpath('//tbody/tr/td[2]/span/text()')[0]) #Returns value = 4
print(len(parser.xpath('//tbody/tr/td[2]/span/text()')[0])) # Returns value = 1

The output shows the expected values, so I'm not sure why it doesn't work. Any help would be greatly appreciated!
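A per-row check can be more telling than a document-wide query here, because the document-wide query succeeds as long as any row matches. The sketch below runs on a small, made-up HTML table rather than the real page; SAMPLE_HTML and rows_missing_match are illustrative names and are not part of the original code. It compares the number of rows with the number of matches and lists the rows for which the relative XPath returns nothing.

```python
from lxml.html import fromstring

# Hypothetical markup standing in for the real page: the second row's
# second cell holds its text directly, with no <span> wrapper.
SAMPLE_HTML = """
<table><tbody>
  <tr><td>Player A</td><td><span>4</span></td></tr>
  <tr><td>Player B</td><td>3</td></tr>
  <tr><td>Player C</td><td><span>2</span></td></tr>
</tbody></table>
"""

def rows_missing_match(html, row_xpath, cell_xpath):
    """Return the indices of rows where cell_xpath matches nothing."""
    rows = fromstring(html).xpath(row_xpath)
    return [idx for idx, row in enumerate(rows) if not row.xpath(cell_xpath)]

doc = fromstring(SAMPLE_HTML)
print(len(doc.xpath('//tbody/tr')))                    # 3 rows
print(len(doc.xpath('//tbody/tr/td[2]/span/text()')))  # 2 matches, one per <span>
print(rows_missing_match(SAMPLE_HTML, '//tbody/tr', './/td[2]/span/text()'))  # [1]
```

If the two counts differ, at least one row has no match, and indexing that row's empty result with [0] is what raises IndexError.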

Best answer

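It fails because the second <td> does not always contain a <span>. For those rows, './/td[2]/span/text()' returns an empty list, so indexing it with [0] raises IndexError. This should work: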

import requests
from lxml.html import fromstring

def get_values():
    print('executing get_values...')
    url = 'https://sports.yahoo.com/nba/stats/weekly/?sortStatId=POINTS_PER_GAME&selectedTable=0'
    response = requests.get(url)
    parser = fromstring(response.text)
    for i in parser.xpath('//tbody/tr')[:100]:
        FGM = i.xpath('.//td[4]/span/text()')[0]
        print('FGM: ' + FGM)
        # Match the cell's own text as well as text inside a <span>.
        G = i.xpath('.//td[2]/text()|.//td[2]/span/text()')[0]  # <--- Changed this
        print(G)

values = get_values()
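The union expression can be tried offline on a small, made-up table. The sketch below is only an illustration under stated assumptions: SAMPLE_HTML is invented markup, and first_text and games_per_row are hypothetical helpers, not part of lxml or of the answer above. It also covers a case the [0] index still cannot handle: a cell with no text at all.

```python
from lxml.html import fromstring

# Hypothetical markup: one cell wraps its value in <span>, one holds the
# text directly, and one is empty.
SAMPLE_HTML = """
<table><tbody>
  <tr><td>Player A</td><td><span>4</span></td></tr>
  <tr><td>Player B</td><td>3</td></tr>
  <tr><td>Player C</td><td></td></tr>
</tbody></table>
"""

def first_text(node, xpath, default=None):
    """Return the first non-blank text result of xpath, or default if none."""
    for text in node.xpath(xpath):
        if text.strip():
            return text.strip()
    return default

def games_per_row(html):
    """Collect the second cell's value for each row using the union XPath."""
    rows = fromstring(html).xpath('//tbody/tr')
    union = './/td[2]/text()|.//td[2]/span/text()'
    return [first_text(row, union, default='N/A') for row in rows]

print(games_per_row(SAMPLE_HTML))  # ['4', '3', 'N/A']
```

Returning a default instead of indexing with [0] keeps the loop running when a cell is empty, and skipping blank strings guards against whitespace-only text nodes that may precede the <span> in real markup.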

Regarding python - Web scraping: XPath list index out of range, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53141642/
