
Python and Selenium web scraping: unable to find element

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:01:32

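I am trying to write a web scraper in Python that triggers the "onclick" handler of certain buttons on a web page, because doing so converts the table containing the data I want into CSV, which makes it much easier to access. The problem is that I cannot locate the element by XPath at all when using PhantomJS. How can I click the element and get the CSV content I want?

Here is my code: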

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By

from selenium.webdriver.common.proxy import *

url = "http://www.pro-football-reference.com/boxscores/201609180nwe.htm"
xpath = "//*[@id='all_player_offense']/div[1]/div/ul/li[1]/div/ul/li[3]/button"

path_to_phantomjs = 'browser/phantomjs'
browser = webdriver.PhantomJS(executable_path = path_to_phantomjs)
browser.get(url)

delay=3
element_present = EC.presence_of_element_located((By.ID, 'all_player_offense'))
WebDriverWait(browser, delay).until(element_present)

browser.find_element_by_xpath(xpath).click()

I get this error:

selenium.common.exceptions.NoSuchElementException: Message: {"errorMessage":"Unable to find element with xpath '//*[@id='all_player_offense']/div[1]/div/ul/li[1]/div/ul/li[3]/button'","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"153","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:50989","User-Agent":"Python-urllib/2.7"},"httpVersion":"1.1","method":"POST","post":"{\"using\": \"xpath\", \"sessionId\": \"93ff24f0-9cbe-11e6-8711-bdfa3ff9cfb1\", \"value\": \"//*[@id='all_player_offense']/div[1]/div/ul/li[1]/div/ul/li[3]/button\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/93ff24f0-9cbe-11e6-8711-bdfa3ff9cfb1/element"}}
Screenshot: available via screen

Best answer

Important point I forgot to mention: as described in this issue on GitHub, try calling set_window_size(width, height) or maximize_window() right after creating the webdriver. You should also consider telling the webdriver to implicitly_wait(10) so that the element has time to appear.
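As a minimal sketch of that advice, the helper below applies a window size and an implicit wait to a driver that has already been created. The name prepare_browser and the default 1280x1024 size are illustrative assumptions, not values taken from the GitHub issue; only set_window_size() and implicitly_wait() are actual WebDriver methods.

```python
def prepare_browser(browser, width=1280, height=1024, wait_seconds=10):
    """Give a headless driver a usable viewport and an implicit wait.

    `browser` is any Selenium WebDriver instance, for example the
    PhantomJS driver created in the question. The default sizes are
    assumptions chosen for illustration.
    """
    # A headless browser may start with a very small viewport, which can
    # keep menu items from being laid out where Selenium can reach them.
    browser.set_window_size(width, height)
    # Element lookups will now retry for up to `wait_seconds` seconds
    # before raising NoSuchElementException.
    browser.implicitly_wait(wait_seconds)
    return browser


# Usage with the driver from the question:
# browser = prepare_browser(webdriver.PhantomJS(executable_path=path_to_phantomjs))
# browser.get(url)
```

Calling it before browser.get(url) means the page is loaded at the chosen size from the start.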

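So, for Selenium WebDriver to correctly simulate what you do by hand, you have to perform one extra action. Essentially, to get the data you want, you have to:

A: Hover the mouse over the "Share & more" dropdown menu. Then

B: Click "Get table as CSV (for Excel)".

For A, this means placing the simulated cursor over the element without clicking it. This "mouse hover" can be done with the move_to_element() method provided by the ActionChains class. So insert the following at the top of your script: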

from selenium.webdriver.common.action_chains import ActionChains

You want Selenium to find a specific element and move to it. You can do this in two lines of code:

dropdown = browser.find_element_by_xpath('//*[@id="all_player_offense"]/div[1]/div/ul/li[1]')
ActionChains(browser).move_to_element(dropdown).perform()

If you omit the above, you will get an ElementNotVisibleException.

Now for B, you should be able to run browser.find_element_by_xpath(xpath).click().
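Steps A and B can be combined into one helper, sketched below under a few assumptions. The function name hover_then_click and its action_chains_cls parameter are hypothetical and exist only to keep the example self-contained. The sketch also uses the generic find_element("xpath", ...) call instead of find_element_by_xpath, since the generic form is available across Selenium releases; that is a substitution, not what the answer above wrote.

```python
def hover_then_click(browser, menu_xpath, button_xpath, action_chains_cls=None):
    """Hover over a dropdown so its items become visible, then click one.

    `action_chains_cls` is a hypothetical hook that lets a stand-in class
    be supplied; by default Selenium's own ActionChains is used.
    """
    if action_chains_cls is None:
        from selenium.webdriver.common.action_chains import ActionChains
        action_chains_cls = ActionChains

    # Step A: move the simulated cursor onto the menu without clicking it.
    # "xpath" is the string value of selenium's By.XPATH locator strategy.
    menu = browser.find_element("xpath", menu_xpath)
    action_chains_cls(browser).move_to_element(menu).perform()

    # Step B: the menu item should now be visible, so it can be clicked.
    button = browser.find_element("xpath", button_xpath)
    button.click()
    return button


# Usage with the XPaths from the question and the answer:
# hover_then_click(
#     browser,
#     '//*[@id="all_player_offense"]/div[1]/div/ul/li[1]',
#     xpath,
# )
```

Looking up the button only after the hover has been performed keeps the order of operations the same as doing it by hand.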

Regarding "Python and Selenium web scraping: unable to find element", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40297647/
