
python - Selenium WebDriver: How do I get a url from an element?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:27:05

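I'm using the Selenium/Splinter libraries and trying to get the URL from each element in order to download PDF statements from Wells Fargo. Scraping the table yields the PDF links. I'd like to click each link and then somehow download the files to a location on my computer.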

    import selenium
    from splinter import Browser
    import time
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome('actual_path')
    driver.get('https://www.wellsfargo.com/')
    driver.delete_all_cookies()  # parentheses are needed to actually call the method

    mainurl = "https://www.wellsfargo.com/"

    # login function - working
    username = driver.find_element_by_id("userid")
    username.send_keys("actual_username")
    passy = driver.find_element_by_id("password")
    passy.send_keys("actual_password")
    submitbutton = driver.find_element_by_xpath("""//*[@id="frmSignon"]/div[5]""")

    driver.find_element_by_xpath('/html/body/div[3]/section/div[1]/div[3]/div[1]/div/div[1]/a[1]').click()
    driver.implicitly_wait(sleeptime)  # sleeptime is not defined in the post; set it to a number of seconds
    driver.find_element_by_link_text('View Statements').click()

    ################## NEED HELP - TO SAVE PDF ELEMENTS AND DOWNLOAD #############
    elem = driver.find_elements_by_class_name("document-title")

    counttotal = 0

    for pdf in elem:
        # index before incrementing so the first element is not skipped
        elem[counttotal].click()
        driver.back()
        counttotal = counttotal + 1

When I try `for i in elem: print(i)`, it prints the elements but not the URL links. Is there any way to get the link from this element?

# Sample Doc To Click & Download 

<div class="documents"><div data-message-container="stmtdiscMessages"><!------------ Error messages -----------------><!----------- Account messages ---------------></div><h3>Statements</h3><p>Deposit account statements are available online for up to 7 years.</p><div class="document large"><div class="document-details account-introtext"> <a role="link" tabindex="0" data-pdf="true" data-url="https://connect.secure.wellsfargo.com/edocs/documents/retrieve/34278aaf-8f37-43de-7d8e-e368124d5f62?_x=gTHPa3PEVAvnSu-uI5vThRyJCGUu-2f4" class="document-title" style="touch-action: auto;">Statement 08/31/19 (21K, PDF)</a></div></div><div class="document large">

#document number 2
<div class="document-details account-introtext"> <a role="link" tabindex="0" data-pdf="true" data-url="https://connect.secure.wellsfargo.com/edocs/documents/retrieve/9efe2b61-8233-8s65-2738-677ef63291f7?_x=h8i20NifIc9dRVCvj9I8pkic0S80i" class="document-title" style="touch-action: auto;">Statement 07/31/19 (21K, PDF)</a></div></div><div class="document large">

#document number 3, etc.
<div class="document-details account-introtext"> <a role="link" tabindex="0" data-pdf="true" data-url="https://connect.secure.wellsfargo.com/edocs/documents/retrieve/7eece2e7-e27e-4445-8s4d-fa5899c5c96b?_x=037X7K-IdhVOVevUISRnQT74qL793tIW" class="document-title" style="touch-action: auto;">Statement 06/30/19 (24K, PDF)</a></div></div><div class="document large">

Best answer

You can retrieve any attribute from an element with the get_attribute function:

    elements = driver.find_elements_by_class_name("document-title")

    pdf_urls = []
    for element in elements:
        pdf_urls.append(element.get_attribute('data-url'))

Or, if you are comfortable with list comprehensions, here is a more Pythonic way:

    elements = driver.find_elements_by_class_name("document-title")

    pdf_urls = [element.get_attribute('data-url') for element in elements]
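The question also asks how to save the files once the URLs are known. The following is only a rough sketch of one possible approach, not something taken from the original answer: it reads `data-url` from each element, derives a file name from the link text, and fetches each URL with the standard library while passing along the cookies from `driver.get_cookies()`. The helper names (`statement_filename`, `cookie_header`, `collect_statements`, `download_statements`) and the `statements` folder are made up for illustration, and it is an assumption that the site accepts requests made outside the browser with those cookies.

```python
import re
import urllib.request
from pathlib import Path


def statement_filename(link_text):
    """Turn link text such as 'Statement 08/31/19 (21K, PDF)' into a file name."""
    # Drop the trailing '(21K, PDF)' part, then replace characters that are unsafe in file names.
    base = re.sub(r"\s*\(.*\)\s*$", "", link_text).strip()
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", base).strip("_")
    return (safe or "statement") + ".pdf"


def cookie_header(cookies):
    """Build a Cookie header value from the list of dicts returned by driver.get_cookies()."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def collect_statements(elements):
    """Return (url, file_name) pairs for elements that carry a data-url attribute."""
    pairs = []
    for element in elements:
        url = element.get_attribute("data-url")
        if url:  # get_attribute returns None when the attribute is absent
            pairs.append((url, statement_filename(element.text)))
    return pairs


def download_statements(driver, elements, dest_dir="statements"):
    """Fetch each statement URL, reusing the browser session's cookies.

    Assumption: the server accepts these cookies outside the browser.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    headers = {"Cookie": cookie_header(driver.get_cookies())}
    for url, name in collect_statements(elements):
        request = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(request) as response:
            (dest / name).write_bytes(response.read())
```

With the `elements` list from the answer, a call would look like `download_statements(driver, elements)`. On Selenium 4.3 and later the `find_elements_by_class_name` helper no longer exists, and the equivalent lookup is `driver.find_elements(By.CLASS_NAME, "document-title")`. If the server rejects requests made this way, configuring the browser's own download folder is another route, which this sketch does not cover.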

Regarding "python - Selenium WebDriver: How do I get a url from an element?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57743538/
