
python - List index out of range error: web scraping with Beautiful Soup

Reposted. Author: 行者123. Updated: 2023-12-01 01:17:36

I am trying to scrape the live departures table from this website using BeautifulSoup.

I tried the following:

caremar_live_departures_table = list(soup.select('.table-booking-history tr'))
caremar_live_departures_data = []
for tr in caremar_live_departures_table:
    td = tr.select('td')
    caremar_live_departures_data.append({
        'DEPARTURE PORT': td[1].select('span span').text,
        'ARRIVAL PORT': td[2].select('span span').text,
        'DEPARTURE TIME': td[4].select('span').text,
        'ARRIVAL TIME': td[6].select('span').text,
        'FEERY TYPE': td[3].select('span span').text,
        'STATUS': td[3].select('span span').text
    })

I get the following error:

 'DEPARTURE PORT': td[1].select('span span').text,
IndexError: list index out of range

td should be an array, so why isn't it?

Best answer

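I looked at the page source, and not every tr in the table contains the data you are looking for. The rows that hold the data carry the classes r1, r2, and so on, so selecting only those gives you what you need. Some of the other rows contain just a single td, which means only td[0] exists for them. That is why you get the IndexError.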
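The cause described above can be reproduced offline. The following is a minimal sketch, assuming a simplified table: the markup in SAMPLE_HTML and the MIN_CELLS threshold are hypothetical and are not copied from the real page. It counts the cells in each row and then keeps only the rows that have enough cells to index safely.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page is more complex.
SAMPLE_HTML = """
<table class="table-booking-history">
  <tr><td colspan="3">Prossime partenze</td></tr>
  <tr class="r1"><td>Procida</td><td>Ischia</td><td>Traghetto</td></tr>
  <tr class="r2"><td>Ischia</td><td>Pozzuoli</td><td>Traghetto</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = soup.select(".table-booking-history tr")

# The first row has a single cell, so td[1] on it would raise IndexError.
cell_counts = [len(tr.select("td")) for tr in rows]
print(cell_counts)

# Assumed threshold: a data row in this sample has at least three cells.
MIN_CELLS = 3

routes = []
for tr in rows:
    cells = tr.select("td")
    if len(cells) < MIN_CELLS:
        continue  # skip header or separator rows instead of indexing into them
    routes.append((cells[0].get_text(strip=True), cells[1].get_text(strip=True)))

print(routes)
```

Checking the cell count is an alternative to filtering on the row class. It may be more robust if the class names on the page change, at the cost of hard-coding an expected column count.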

Also, I think your list indices may be wrong. I have done my best to fix them.

import requests
from bs4 import BeautifulSoup

# Download the page and parse it with the built-in HTML parser
r = requests.get('https://shop.caremar.it/it/prossime-partenze/')
soup = BeautifulSoup(r.text, 'html.parser')

# Keep only rows whose class attribute contains "r" (r1, r2, ...);
# rows without a matching class are skipped
caremar_live_departures_table = list(soup.select('.table-booking-history tr[class*="r"]'))
caremar_live_departures_data = []
for tr in caremar_live_departures_table:
    td = tr.select('td')
    # Cell indices are zero-based; .text.strip() returns the cell text without surrounding whitespace
    caremar_live_departures_data.append({
        'DEPARTURE PORT': td[0].text.strip(),
        'ARRIVAL PORT': td[1].text.strip(),
        'DEPARTURE TIME': td[3].text.strip(),
        'ARRIVAL TIME': td[5].text.strip(),
        'FEERY TYPE': td[2].text.strip(),
        'STATUS': td[6].text.strip()
    })
print(caremar_live_departures_data)

Output

[{'DEPARTURE PORT': 'Procida', 'ARRIVAL PORT': 'Ischia', 'DEPARTURE TIME': '23:00', 'ARRIVAL TIME': '23:30', 'FEERY TYPE': 'Traghetto', 'STATUS': 'Chiuso'}, {'DEPARTURE PORT': 'Ischia', 'ARRIVAL PORT': 'Procida', 'DEPARTURE TIME': '02:30', 'ARRIVAL TIME': '02:45', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Ischia', 'ARRIVAL PORT': 'Pozzuoli', 'DEPARTURE TIME': '02:30', 'ARRIVAL TIME': '03:30', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Procida', 'ARRIVAL PORT': 'Pozzuoli', 'DEPARTURE TIME': '03:10', 'ARRIVAL TIME': '03:30', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Pozzuoli', 'ARRIVAL PORT': 'Procida', 'DEPARTURE TIME': '04:10', 'ARRIVAL TIME': '05:10', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Pozzuoli', 'ARRIVAL PORT': 'Ischia', 'DEPARTURE TIME': '04:10', 'ARRIVAL TIME': '05:40', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Procida', 'ARRIVAL PORT': 'Ischia', 'DEPARTURE TIME': '04:40', 'ARRIVAL TIME': '05:40', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Napoli (Porta di Massa)', 'ARRIVAL PORT': 'Capri', 'DEPARTURE TIME': '05:35', 'ARRIVAL TIME': '06:25', 'FEERY TYPE': 'TMV', 'STATUS': ''}, {'DEPARTURE PORT': 'Napoli (Porta di Massa)', 'ARRIVAL PORT': 'Procida', 'DEPARTURE TIME': '06:15', 'ARRIVAL TIME': '07:15', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Napoli (Porta di Massa)', 'ARRIVAL PORT': 'Ischia', 'DEPARTURE TIME': '06:15', 'ARRIVAL TIME': '07:55', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Procida', 'ARRIVAL PORT': 'Napoli (Molo Beverello)', 'DEPARTURE TIME': '06:35', 'ARRIVAL TIME': '07:05', 'FEERY TYPE': 'Aliscafo', 'STATUS': ''}, {'DEPARTURE PORT': 'Capri', 'ARRIVAL PORT': 'Napoli (Porta di Massa)', 'DEPARTURE TIME': '06:40', 'ARRIVAL TIME': '08:00', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Ischia', 'ARRIVAL PORT': 'Procida', 'DEPARTURE TIME': '06:45', 'ARRIVAL TIME': '07:00', 
'FEERY TYPE': 'Aliscafo', 'STATUS': ''}, {'DEPARTURE PORT': 'Ischia', 'ARRIVAL PORT': 'Napoli (Molo Beverello)', 'DEPARTURE TIME': '06:45', 'ARRIVAL TIME': '07:50', 'FEERY TYPE': 'Aliscafo', 'STATUS': ''}, {'DEPARTURE PORT': 'Capri', 'ARRIVAL PORT': 'Sorrento', 'DEPARTURE TIME': '07:00', 'ARRIVAL TIME': '07:25', 'FEERY TYPE': 'TMV', 'STATUS': ''}, {'DEPARTURE PORT': 'Procida', 'ARRIVAL PORT': 'Napoli (Molo Beverello)', 'DEPARTURE TIME': '07:10', 'ARRIVAL TIME': '07:50', 'FEERY TYPE': 'Aliscafo', 'STATUS': ''}, {'DEPARTURE PORT': 'Ischia', 'ARRIVAL PORT': 'Procida', 'DEPARTURE TIME': '07:20', 'ARRIVAL TIME': '07:50', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Ischia', 'ARRIVAL PORT': 'Pozzuoli', 'DEPARTURE TIME': '07:20', 'ARRIVAL TIME': '08:30', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Procida', 'ARRIVAL PORT': 'Ischia', 'DEPARTURE TIME': '07:25', 'ARRIVAL TIME': '07:55', 'FEERY TYPE': 'Traghetto', 'STATUS': ''}, {'DEPARTURE PORT': 'Napoli (Molo Beverello)', 'ARRIVAL PORT': 'Procida', 'DEPARTURE TIME': '07:30', 'ARRIVAL TIME': '08:05', 'FEERY TYPE': 'Aliscafo', 'STATUS': ''}]
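A related detail in the question's code: select() returns a list-like ResultSet, so an expression such as td[1].select('span span').text would fail with an AttributeError even when the index is valid. The sketch below illustrates the difference between select() and select_one() on a hypothetical cell. The CELL_HTML markup and the cell_text helper are assumptions made for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one cell with nested spans, one empty cell.
CELL_HTML = (
    "<table><tr>"
    "<td><span><span>Procida</span></span></td>"
    "<td></td>"
    "</tr></table>"
)

soup = BeautifulSoup(CELL_HTML, "html.parser")
first_td, empty_td = soup.select("td")

# select() always returns a list-like ResultSet, even for a single match.
matches = first_td.select("span span")
try:
    matches.text
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__  # a ResultSet has no .text attribute

# select_one() returns the first matching element, or None if nothing matches.
port = first_td.select_one("span span").get_text(strip=True)


def cell_text(td, selector):
    """Hypothetical helper: text of the first match, or '' when there is none."""
    node = td.select_one(selector)
    return node.get_text(strip=True) if node is not None else ""


print(port, repr(cell_text(empty_td, "span span")), error_name)
```

The answer above avoids the problem by reading td[n].text directly, which is simpler when the whole cell text is wanted.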

This post is based on a similar question found on Stack Overflow, "python - List index out of range error: web scraping with Beautiful Soup": https://stackoverflow.com/questions/54173255/
