
python - How to skip to the next URL if an element is not found or a TimeoutException occurs in a Selenium wait function

Reposted. Author: 行者123. Updated: 2023-12-03 08:39:05

I am trying to scrape the daily observations table from a weather site. I have the code below to fetch that specific table:

# Iterate requests over each weather station and date
for station, month, year in product(weather_station, month, year):

    areacode = weather_station[station]['areacode']

    # Set link according to data need
    driver.get('https://www.wunderground.com/history/monthly/'+countrycode+'/'+station+'/'+areacode+'/date/'+str(year)+'-'+str(month))

    # Wait for the webpage to fully load the necessary tables
    wait = WebDriverWait(driver, 15)

    # Update the XPath in case the webpage HTML format changes
    xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
    tables = wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath_html_loc)))

    # Save only the necessary table from the loaded webpage
    for table in tables:
        histo_table = pd.read_html(table.get_attribute('outerHTML'))
        histo_weather = histo_table[2].fillna('')

    print("Weather observations for ", str(month), "-", str(year), " from station", station, "is ready \n")
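This code iterates over all the necessary pages on the site and works fine when it can fetch the specific table I want, but when the table does not exist on the page or the link is unavailable, it raises a TimeoutException.
I read about the try and except option, but I can't seem to make it work in this case. Can you suggest a better working solution? The code below, with try and except, still ends in a TimeoutException. If the table element does not exist or the link is unavailable, I want the code to skip the current URL and move on to the next one (that is, go back to the top of the for loop and iterate to the next item).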
try:
    # Set link according to data need
    driver.get('https://www.wunderground.com/history/monthly/'+countrycode+'/'+station+'/'+areacode+'/date/'+str(year)+'-'+str(month))

    # Wait for the webpage to fully load the necessary tables
    wait = WebDriverWait(driver, 15)

    # Update the XPath in case the webpage HTML format changes
    xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
    tables = driver.find_elements(By.XPATH, xpath_html_loc)
    print(tables)
except TimeoutException as exception:
    raise exception
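
One likely reason the attempt above still ends in a TimeoutException is the `raise exception` line: re-raising inside `except` propagates the error out of the loop instead of skipping the URL. The following is a minimal sketch of the difference, not the asker's code. It uses Python's built-in `TimeoutError` and a hypothetical `load` function as stand-ins for Selenium's `TimeoutException` and the page load.

```python
def load(url):
    """Hypothetical stand-in for driver.get + wait.until; fails for 'bad' URLs."""
    if "bad" in url:
        raise TimeoutError(f"timed out loading {url}")
    return f"table from {url}"


def scrape_reraise(urls):
    """Mirrors the attempt above: the except block re-raises, so the loop stops."""
    done = []
    for url in urls:
        try:
            done.append(load(url))
        except TimeoutError as exception:
            raise exception
    return done


def scrape_skip(urls):
    """Uses continue instead: the failing URL is skipped and the loop goes on."""
    done = []
    for url in urls:
        try:
            done.append(load(url))
        except TimeoutError:
            continue
    return done


urls = ["ok-1", "bad-2", "ok-3"]
print(scrape_skip(urls))
```

Calling `scrape_reraise(urls)` stops at the second URL with a `TimeoutError`, while `scrape_skip(urls)` returns results for the first and third.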

Best answer

I was able to get it working with the following workaround:

from selenium.common.exceptions import WebDriverException

for link in links:
    try:
        print("Trying for ", link)
        # Set link according to data need
        driver.get(link)
        # Wait for the webpage to fully load the necessary tables
        wait = WebDriverWait(driver, 15)

        # Update the XPath in case the webpage HTML format changes
        xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
        wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath_html_loc)))
        tables = driver.find_elements(By.XPATH, xpath_html_loc)
    except WebDriverException:
        # TimeoutException is a subclass of WebDriverException, so this also
        # covers the case where loading took too long
        print("Loading took too long! Data unavailable")
        continue

    if len(tables) > 0:
        # Do code here
        pass
    else:
        print("data is unavailable")
        continue
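Thanks to the try and except code, the loop continues even if the link is unavailable or the table cannot be loaded, which avoids the TimeoutException stopping the run. I used the wait.until expected condition (to wait until the webpage and the table I need have loaded) and find_elements (to locate the specific table). If the table is not found on the page, or is genuinely unavailable even after the page has loaded, the if-else check suggested by @Dilip lets the for loop continue.
Thanks for your help!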
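
The skip-and-continue pattern described in this answer can also be factored into a small helper. This is only a sketch under stated assumptions, not the answerer's code: `scrape_links`, `make_selenium_fetcher`, `fetch_tables`, `skip_on`, and `fake_fetch` are hypothetical names introduced for illustration. The Selenium imports are done lazily so the loop itself can be tried with a fake fetcher.

```python
def scrape_links(links, fetch_tables, skip_on=(Exception,)):
    """Visit each link; skip it if fetching raises or returns no tables.

    fetch_tables and skip_on are hypothetical parameters for this sketch.
    """
    results = {}
    skipped = []
    for link in links:
        try:
            tables = fetch_tables(link)
        except skip_on as exc:
            # Record the failure and move on to the next link
            skipped.append((link, type(exc).__name__))
            continue
        if not tables:
            skipped.append((link, "no tables"))
            continue
        results[link] = tables
    return results, skipped


def make_selenium_fetcher(driver, xpath, timeout=15):
    """Build a fetch_tables callable around an existing Selenium driver."""
    # Imported lazily so scrape_links can be tried without Selenium installed
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def fetch_tables(link):
        driver.get(link)
        # Raises selenium's TimeoutException if nothing matches in time
        return WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.XPATH, xpath))
        )

    return fetch_tables


def fake_fetch(link):
    """Stand-in fetcher used only to exercise the loop."""
    if link == "slow":
        raise TimeoutError("page took too long")
    return [] if link == "empty" else ["<table>"]


results, skipped = scrape_links(
    ["ok", "slow", "empty"], fake_fetch, skip_on=(TimeoutError,)
)
print(results, skipped)
```

With a real driver, one would presumably pass `make_selenium_fetcher(driver, xpath_html_loc)` as `fetch_tables` and `skip_on=(WebDriverException,)`, imported from `selenium.common.exceptions`.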

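Regarding "python - How to skip to the next URL if an element is not found or a TimeoutException occurs in a Selenium wait function", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63804694/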
