
python - TimeoutException: Message:

Reposted. Author: 行者123. Updated: 2023-12-05 06:08:03

import os   
from selenium import webdriver
import time
from linkedin_scraper import actions
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options


chrome_options = Options()
chrome_options.add_argument("--headless")

driver = webdriver.Chrome("driver/chromedriver", options=chrome_options)
email = os.getenv("LINKEDIN_USER")
password = os.getenv("LINKEDIN_PASSWORD")

actions.login(driver, email, password) # if email and password isnt given, it'll prompt in terminal
driver.get('https://www.linkedin.com/company/biorasi-llc/about/')

_ = WebDriverWait(driver, 3).until(EC.presence_of_all_elements_located((By.TAG_NAME, 'section')))

time.sleep(3)
grid = driver.find_elements_by_tag_name("section")[3]
about_us = grid.find_elements_by_tag_name("p")[0].text.strip()

print(about_us)

This is the code I use to scrape a company's about_us data. It works, but sometimes I get an error like this:

TimeoutException                          Traceback (most recent call last)

     17 email = os.getenv("LINKEDIN_USER")
     18 password = os.getenv("LINKEDIN_PASSWORD")
---> 19 actions.login(driver, email, password) # if email and password isnt given, it'll prompt in terminal
     20 driver.get('https://www.linkedin.com/company/biorasi-llc/about/')
     21 _ = WebDriverWait(driver, 3).until(EC.presence_of_all_elements_located((By.TAG_NAME, 'section')))

~\Anaconda3\lib\site-packages\linkedin_scraper\actions.py in login(driver, email, password)
     28     password_elem.submit()
     29
---> 30     element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "profile-nav-item")))

~\Anaconda3\lib\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)
     78             if time.time() > end_time:
     79                 break
---> 80         raise TimeoutException(message, screen, stacktrace)
     81
     82     def until_not(self, method, message=''):

TimeoutException: Message:

Can anyone help me solve this?

Best answer

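This is probably because your timeout is too short (3 seconds), so the threshold is reached before the page has fully loaded. Try raising it to 5-10 seconds on line 21. Note that the traceback above is raised one step earlier, inside actions.login, where the library waits up to 10 seconds for the element with ID profile-nav-item, so the login step can also time out on its own when the post-login page is slow or looks different from what the library expects.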

TIMEOUT = 10
_ = WebDriverWait(driver, TIMEOUT).until(EC.presence_of_all_elements_located((By.TAG_NAME, 'section')))
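
Because the error only appears sometimes, one option besides a longer timeout is to retry the step that timed out. The following is a minimal sketch of that idea, not part of Selenium or linkedin_scraper: call_with_retry and its parameters are hypothetical names, and whether it is safe to call actions.login a second time depends on how that function behaves on re-entry, which is an assumption here.

```python
import time


def call_with_retry(func, exceptions, attempts=3, delay=2.0):
    """Call func(); if it raises one of `exceptions`, wait and try again.

    `call_with_retry` is a hypothetical helper, not a Selenium API.
    The last exception is re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)  # brief pause before the next attempt


# Hypothetical usage with the names from the question:
# call_with_retry(lambda: actions.login(driver, email, password),
#                 exceptions=(TimeoutException,))
```

Passing the exception types explicitly keeps the helper from swallowing unrelated errors, such as a wrong password or a missing driver binary.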

Here are some tips for improving your code:

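  • You are already using a fluent wait (WebDriverWait), so try to minimize the use of time.sleep where possible. WebDriverWait stops waiting and returns your element as soon as it is found, which saves time.
  • Locating an element by tag name and ordinal position (here, the 4th section tag) is not a good idea. It will break if the site adds more sections. Try a more specific XPath. Here is my code; I haven't tested it, but I think it should work fine.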
import os
from selenium import webdriver
from linkedin_scraper import actions
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options


chrome_options = Options()
chrome_options.add_argument("--headless")

driver = webdriver.Chrome("driver/chromedriver", options=chrome_options)
email = os.getenv("LINKEDIN_USER")
password = os.getenv("LINKEDIN_PASSWORD")

actions.login(driver, email, password) # if email and password isnt given, it'll prompt in terminal
driver.get('https://www.linkedin.com/company/biorasi-llc/about/')

# directly finds paragraph, removed time.sleep
paragraph_elem = WebDriverWait(driver, 15).until(EC.presence_of_element_located((By.XPATH, '//section//h4/..//p')))
about_us = paragraph_elem.text.strip()

print(about_us)
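
To illustrate the first tip, here is a simplified sketch of the polling idea behind an explicit wait. It is not Selenium's actual implementation; wait_until, its parameters, and the use of the built-in TimeoutError are assumptions made for the example. The point is that the function returns as soon as the condition holds, whereas time.sleep(3) always costs the full three seconds.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or `timeout` elapses.

    Hypothetical helper that mimics the general shape of an explicit wait.
    Returns the truthy value, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # stop immediately once the condition is met
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)  # short pause between checks
```

With a generous timeout the common case stays fast, because the timeout is only an upper bound on the wait rather than a fixed delay.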

Regarding python - TimeoutException: Message:, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/65193776/
