
python - Finding the data in an HTML form that corresponds to a specific selection

Reposted. Author: 行者123. Updated: 2023-12-03 19:01:30

Closed. This question needs details or clarity. It is not currently accepting answers.
















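I am trying to scrape data from the form at http://appl101.lsu.edu/booklet2.nsf/Selector2?OpenForm
The form's action is "/booklet2.nsf/f5e6e50d1d1d05c4862584410071cd2e?CreateDocument". Each selected (semester, department) pair leads to a corresponding page containing data.
My goal is to write some Python code that finds the page URL for every (semester, department) pair. To start, I am trying to find the URL for one specific selection, for example (Fall 2020, Mathematics).
I am new to web scraping and only know some basic HTML. I would appreciate it if someone could point me in the right direction. Also, please explain in detail how this form's action works.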

Best answer

You can use this example to get the URL for each pair (though most of them will return NoCourseDept):

import requests
from bs4 import BeautifulSoup

base_url = 'http://appl101.lsu.edu/booklet2.nsf/Selector2?OpenForm'
post_url = 'http://appl101.lsu.edu/booklet2.nsf/f5e6e50d1d1d05c4862584410071cd2e?CreateDocument'

# Load the selector page so the available choices can be read from it.
soup = BeautifulSoup(requests.get(base_url).content, 'lxml')

# Semesters: the value attribute of each choice under SemesterDesc.
semesters = []
for s in soup.select('[name="SemesterDesc"] [value]'):
    semesters.append(s['value'])

# Departments: the visible text of each <option> under Department.
departments = []
for d in soup.select('[name="Department"] option'):
    departments.append(d.get_text(strip=True))

# Submit the form once per (semester, department) pair.
for s in semesters:
    for d in departments:
        data = {
            '%%Surrogate_SemesterDesc': 1,
            'SemesterDesc': s,
            '%%Surrogate_Department': 1,
            'Department': d,
        }
        r = requests.post(post_url, data=data)
        # r.url is the final URL after any redirects.
        print('{:<30} {:<30} {}'.format(s, d, r.url))

This prints:
...

Second Summer Module 2021      BUSINESS ADMINISTRATION        https://appl101.lsu.edu/booklet2.nsf/All/FFAC316D00E5F3D5862585C7002EF1AA?OpenDocument
Second Summer Module 2021      BUSINESS EDUCATION             https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      BUSINESS LAW                   https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CHEMICAL ENGINEERING           https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CHEMISTRY                      https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CHILD AND FAMILY STUDIES       https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CHINESE                        https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CIVIL ENGINEERING              https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CIVIL & ENVIRONMENTAL ENGINEER https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CLASSICAL STUDIES              https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      COMMUNICATION DISORDERS        https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      COMMUNICATION STUDIES          https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      COMPARATIVE BIOMEDICAL SCIENCE https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      COMPARATIVE LITERATURE         https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      COMPUTER SCIENCE               https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      CONSTRUCTION MANAGEMENT        https://appl101.lsu.edu/booklet2.nsf/All/637EAD668A213EDC862585F200296FAE?OpenDocument
Second Summer Module 2021      CURRICULUM & INSTRUCTION       https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      DAIRY SCIENCE                  https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      DIGITAL MEDIA ARTS & ENGINEERI https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      DISASTER SCIENCE MANAGEMENT    https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      DOCTOR OF DESIGN               https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      ECONOMICS                      https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform
Second Summer Module 2021      EDUC LEADERSHIP RESEARCH COUNS https://appl101.lsu.edu/booklet2.nsf/All/B0D27015A5F630CF86258602002C263E?OpenDocument
Second Summer Module 2021      EDUCATION                      https://appl101.lsu.edu/booklet2.nsf/NoCourseDept?readform

...
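
On how the form works: the action is a path relative to the site root, so the browser joins it with the page address and POSTs the selected fields to the result. For a single pair such as (Fall 2020, Mathematics), something along these lines might work. It is only a sketch: the helper names resolve_action, has_courses and find_url are made up for illustration, reading the action from the first form on the live page assumes the page has one form and that the ID inside the action can change over time, and the "NoCourseDept" check is based only on the URLs in the output above.

```python
from urllib.parse import urljoin

BASE_URL = 'http://appl101.lsu.edu/booklet2.nsf/Selector2?OpenForm'


def resolve_action(page_url, action):
    """Join the form's relative action with the page URL, as a browser would."""
    return urljoin(page_url, action)


def has_courses(result_url):
    """Return False for the NoCourseDept placeholder URL seen in the output."""
    return 'nocoursedept' not in result_url.lower()


def find_url(semester, department):
    """POST one (semester, department) pair; return the result URL or None.

    Performs network requests. The third-party imports live here so the two
    pure helpers above can be used without requests or bs4 installed.
    """
    import requests
    from bs4 import BeautifulSoup

    page = requests.get(BASE_URL, timeout=30)
    soup = BeautifulSoup(page.content, 'html.parser')
    form = soup.find('form')  # assumption: the first form is the selector form
    if form is None or not form.get('action'):
        return None
    post_url = resolve_action(page.url, form['action'])
    data = {
        '%%Surrogate_SemesterDesc': 1,
        'SemesterDesc': semester,
        '%%Surrogate_Department': 1,
        'Department': department,
    }
    r = requests.post(post_url, data=data, timeout=30)
    return r.url if has_courses(r.url) else None
```

You would call it as find_url(semester, department), where both strings have to match the values offered on the form exactly. Print the semesters and departments lists from the code above to see the exact spelling, since something like 'Fall 2020' or 'MATHEMATICS' is only a guess.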

A similar question to "python - Finding the data in an HTML form that corresponds to a specific selection" can be found on Stack Overflow: https://stackoverflow.com/questions/64627617/
