
python - Opening multiple pages with Python Selenium

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:23:00

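I am trying to use Python and Selenium to loop through a list of web pages and download a file from each one. I can open one page and download the first file I want using a while loop, but as soon as I reach the second element in the list of web pages, Selenium seems to fail.

Here is my code: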

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)

browser.get("file:///path to html file")

#these are example webpages
all_trails = ['www.google.com', 'www.yahoo.com', 'www.bing.com']

index = 0

while (index <= 2):

    url = all_trails[index]
    browser.get(url)

    browser.find_element_by_link_text('Sign In').click()

    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')

    username.send_keys("username")
    password.send_keys("password")

    browser.find_element_by_xpath("//button[@type='submit' and @class='btn btn-primary btn-lg' and contains(text(), 'Log In')]").click()

    results_url = browser.find_element_by_xpath("//a[@class='require-user' and contains(text(), 'GPX File')]").click()
    index += 1

    browser.quit()
    time.sleep(5)

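I can download the file from the first element of the array (i.e. www.google.com). The loop then reaches the second list element, www.yahoo.com, but as soon as it hits browser.get(url) I get this error: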

Traceback (most recent call last):
  File "trails_scraper.py", line 22, in <module>
    browser.get(url)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in get
    self.execute(Command.GET, {'url': url})
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 306, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 460, in execute
    return self._request(command_info[0], url, body=data)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 483, in _request
    self._conn.request(method, parsed_url.path, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1093, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 575, in create_connection
    raise err
socket.error: [Errno 61] Connection refused

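Does anyone know what is going on? I know a for loop would be the more error-proof way to do this, but logically my code seems to be correct.

Any help would be greatly appreciated :)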

Best answer

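The problem is that you create the browser outside the loop but call browser.quit() inside it. When the loop finishes its first pass it closes the browser, so on the second pass

browser.get(url)

fails because there is no browser left to talk to.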
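The failure can be reproduced without a real browser. The sketch below uses a hypothetical StubBrowser class (not part of Selenium) that only mimics the relevant behavior: once quit() has been called, nothing is listening any more and a later get() is refused. Raising ConnectionRefusedError is an assumption chosen to mirror the socket.error in the traceback; the exact exception a given Selenium release raises may differ.

```python
import errno


class StubBrowser:
    """Hypothetical stand-in for a WebDriver; it only tracks whether quit() was called."""

    def __init__(self):
        self.alive = True
        self.visited = []

    def get(self, url):
        if not self.alive:
            # Mirrors "Connection refused": the driver process is gone.
            raise ConnectionRefusedError(errno.ECONNREFUSED, "Connection refused")
        self.visited.append(url)

    def quit(self):
        self.alive = False


def run_buggy_loop(urls):
    """Same shape as the question's loop: one browser, quit() inside the loop."""
    browser = StubBrowser()
    for url in urls:
        browser.get(url)  # the second pass fails here
        browser.quit()    # bug: the only browser is shut down after the first page
    return browser.visited
```

With a single URL the loop completes; with two or more, the second get() raises, which matches the behavior described in the question.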

You have two solutions:

1) Move the browser declaration inside the loop

from selenium import webdriver
import time

path_to_chromedriver = 'path to chromedriver location'

# these are example webpages (real URLs need a scheme such as https://)
all_trails = ['www.google.com', 'www.yahoo.com', 'www.bing.com']

index = 0

while (index <= 2):
    # start a fresh browser for every page
    browser = webdriver.Chrome(executable_path = path_to_chromedriver)

    browser.get("file:///path to html file")

    url = all_trails[index]
    browser.get(url)

    browser.find_element_by_link_text('Sign In').click()

    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')

    username.send_keys("username")
    password.send_keys("password")

    browser.find_element_by_xpath("//button[@type='submit' and @class='btn btn-primary btn-lg' and contains(text(), 'Log In')]").click()

    results_url = browser.find_element_by_xpath("//a[@class='require-user' and contains(text(), 'GPX File')]").click()
    index += 1

    # closing here is safe because the next pass creates a new browser
    browser.quit()
    time.sleep(5)
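One weakness of creating a browser per page is that an exception in the middle of the loop body skips browser.quit() and can leave a Chrome process behind. A minimal sketch of option 1 with that gap closed follows. The names visit_each_with_fresh_browser, make_browser, and handle_page are hypothetical and introduced only for illustration; make_browser is assumed to be any zero-argument callable that returns an object with get() and quit().

```python
import time


def visit_each_with_fresh_browser(urls, make_browser, handle_page, pause=5):
    """Open a new browser for every URL and always close it again.

    make_browser: zero-argument callable returning a driver-like object
                  (assumed to expose get() and quit()).
    handle_page:  callable taking the browser; performs the login/download steps.
    """
    for url in urls:
        browser = make_browser()
        try:
            browser.get(url)
            handle_page(browser)
        finally:
            # Runs even if get() or handle_page() raised.
            browser.quit()
        time.sleep(pause)  # pause between pages, as in the original loop
```

In a real script, make_browser could be something like lambda: webdriver.Chrome(executable_path=path_to_chromedriver), and handle_page would contain the sign-in and GPX click steps.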

2) Close the browser after the loop ends

from selenium import webdriver
import time

path_to_chromedriver = 'path to chromedriver location'
browser = webdriver.Chrome(executable_path = path_to_chromedriver)

browser.get("file:///path to html file")

# these are example webpages (real URLs need a scheme such as https://)
all_trails = ['www.google.com', 'www.yahoo.com', 'www.bing.com']

index = 0

while (index <= 2):

    url = all_trails[index]
    browser.get(url)

    browser.find_element_by_link_text('Sign In').click()

    username = browser.find_element_by_xpath("//input[@placeholder='Log in with email']")
    password = browser.find_element_by_name('pass')

    username.send_keys("username")
    password.send_keys("password")

    browser.find_element_by_xpath("//button[@type='submit' and @class='btn btn-primary btn-lg' and contains(text(), 'Log In')]").click()

    results_url = browser.find_element_by_xpath("//a[@class='require-user' and contains(text(), 'GPX File')]").click()
    index += 1
    time.sleep(5)

# the single browser is closed once, after every page has been visited
browser.quit()
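The question notes that a for loop would be more robust than a manual index. A sketch of option 2 in that style follows, written against the Selenium 4 locator API (find_element with By), since the find_element_by_* helpers used above are no longer available in newer Selenium 4 releases. The function names make_chrome, download_gpx, and visit_all_with_one_browser are hypothetical; make_chrome assumes a Selenium 4 installation that can locate chromedriver on its own, and the locators are copied from the question.

```python
import time


def make_chrome():
    """Create a Chrome driver; assumes Selenium 4 with a discoverable chromedriver."""
    from selenium import webdriver  # imported lazily so the loop below has no hard dependency
    return webdriver.Chrome()


def download_gpx(browser):
    """Log in and click the GPX link, using the locators from the question."""
    from selenium.webdriver.common.by import By

    browser.find_element(By.LINK_TEXT, "Sign In").click()
    browser.find_element(
        By.XPATH, "//input[@placeholder='Log in with email']"
    ).send_keys("username")
    browser.find_element(By.NAME, "pass").send_keys("password")
    browser.find_element(
        By.XPATH,
        "//button[@type='submit' and @class='btn btn-primary btn-lg'"
        " and contains(text(), 'Log In')]",
    ).click()
    browser.find_element(
        By.XPATH, "//a[@class='require-user' and contains(text(), 'GPX File')]"
    ).click()


def visit_all_with_one_browser(urls, browser, handle_page, pause=5):
    """Reuse one browser for every URL and quit it exactly once at the end."""
    try:
        for url in urls:
            browser.get(url)
            handle_page(browser)
            time.sleep(pause)  # same per-page pause as the original loop
    finally:
        browser.quit()  # runs once, after the loop finishes or an error occurs
```

A call such as visit_all_with_one_browser(all_trails, make_chrome(), download_gpx) would then replace the while loop, with all_trails holding full URLs including the scheme.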

Regarding python - Opening multiple pages with Python Selenium, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47828773/
