python - Scrape .aspx site after click

Reposted. Author: 行者123. Updated: 2023-12-01 08:08:11

I am trying to scrape my squadron's schedule data from: https://www.cnatra.navy.mil/scheds/schedule_data.aspx?sq=vt-9

I have figured out how to extract the data with BeautifulSoup:

# Python 2 (urllib2 does not exist in Python 3)
import urllib2
import bs4 as bs

url = 'https://www.cnatra.navy.mil/scheds/schedule_data.aspx?sq=vt-9'
html = urllib2.urlopen(url).read()       # fetch the page as initially served
soup = bs.BeautifulSoup(html, 'lxml')    # parse with the lxml backend
table = soup.find('table')               # first <table> on the page
print(table.text)

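However, the table stays hidden until a date is selected (if it is not the current day) and the "View Schedule" button is pressed.

How can I modify my code to "press" the "View Schedule" button so that I can scrape the data? Bonus points if the code can also select a date!

I tried using: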

import urllib2
from urllib2 import urlopen
import bs4 as bs
from selenium import webdriver

driver = webdriver.Chrome("/users/base/Downloads/chromedriver")
driver.get("https://www.cnatra.navy.mil/scheds/schedule_data.aspx?sq=vt-9")
button = driver.find_element_by_id('btnViewSched')
button.click()

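This successfully opens Chrome and "clicks" the button, but I cannot scrape the result because the address does not change.

Best answer

You can get the schedule with Selenium alone: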

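from selenium import webdriver
from selenium.webdriver.common.by import By

# Selenium 4 style. Older releases used webdriver.Chrome('chromedriver.exe')
# and driver.find_element_by_id(...), as in the original answer.
driver = webdriver.Chrome()
driver.get("https://www.cnatra.navy.mil/scheds/schedule_data.aspx?sq=vt-9")
# Click "View Schedule"; the page re-renders at the same address.
driver.find_element(By.ID, 'btnViewSched').click()
# The schedule is read from the element with id 'dgEvents'.
print(driver.find_element(By.ID, 'dgEvents').text)
driver.quit()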

Output:

TYPE VT Brief EDT RTB Instructor Student Event Hrs Remarks Location
Flight VT-9 07:45 09:45 11:15 JARVIS, GRANT M [LT] LENNOX, KEVIN I [ENS] BI4101 1.5 2 HR BRIEF MASS BRIEF
Flight VT-9 07:45 09:45 11:15 MOYNAHAN, WILLIAM P [CDR] FINNERAN, MATTHEW P [1stLt] BI4101 1.5 2 HR BRIEF MASS BRIEF
Flight VT-9 07:45 12:15 13:45 JARVIS, GRANT M [LT] TAYLOR, ADAM R [1stLt] BI4101 1.5 2 HR BRIEF MASS BRIEF @ 0745 W/ JARVIS MEI OPS
Flight VT-9 07:45 12:15 13:45 MOYNAHAN, WILLIAM P [CDR] LOW, TRENTON G [ENS] BI4101 1.5 2 HR BRIEF MASS BRIEF @ 0745 W/ MOYNAHAN MEI OPS
Watch VT-9 00:00 14:00 ANDERSON, LAURA [LT] ODO (ON CALL) 14.0
Watch VT-9 00:00 14:00 ANDERSON, LAURA [LT] ODO (ON CALL) 14.0
Watch VT-9 00:00 23:59 ANDERSON, LAURA [LT] ODO (ON CALL) 24.0
Watch VT-9 00:00 23:59 ANDERSON, LAURA [LT] ODO (ON CALL) 24.0
Watch VT-9 07:00 19:00 STUY, JOHN [LTJG] DAY IWO 12.0
Watch VT-9 19:00 07:00 STRACHAN, ALLYSON [LTJG] IWO 12.0
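
Because .text joins every cell with spaces, empty cells vanish and the columns of rows such as the Watch entries can no longer be told apart. If the columns matter, one option is to take the markup of the dgEvents element (for example through Selenium's get_attribute('outerHTML')) and split it into cells. The following is only a minimal Python 3 sketch using the standard library. It assumes dgEvents is an ordinary HTML table built from tr, th and td tags, which the source does not confirm. The names TableRows, parse_table and sample are hypothetical, and sample merely stands in for the real page markup.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace; an empty cell becomes "".
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup standing in for the real dgEvents table.
sample = """
<table id="dgEvents">
  <tr><th>TYPE</th><th>VT</th><th>Brief</th><th>Remarks</th></tr>
  <tr><td>Watch</td><td>VT-9</td><td>07:00</td><td></td></tr>
</table>
"""

header, *records = parse_table(sample)
print(header)
print(records)
```

With a live browser session, the markup of the dgEvents element would be passed to parse_table in place of sample, so that empty cells are kept as empty strings and every row has the same number of columns as the header.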

Regarding "python - Scrape .aspx site after click", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55428119/
