
Python requests: 422 error on POST

Reposted. Author: 行者123. Updated: 2023-11-28 17:07:14

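I have been trying to scrape a website that, like GitHub, requires login authentication, but unlike GitHub it has no API. I followed these instructions and many others, but nothing seems to work; every attempt just returns a 422 error.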

import requests
from lxml import html

url = "https://github.com/login"
user = "my email"
pas = "associated password"

sess = requests.Session()
r = sess.get(url)

rhtml = html.fromstring(r.text)

#get all hidden input fields and make a dict of them
hidden = rhtml.xpath(r'//form//input[@type="hidden"]')
form = {x.attrib["name"]: x.attrib["value"] for x in hidden}

#add login creds to the dict
form['login'] = user
form['password'] = pas

#post
res = sess.post(url, data=form)

print(res)
# <Response [422]>

I also tried sess.post(url, data={'login':user, 'password':pas}) and got the same result. Fetching the cookies first and sending them with the POST did not seem to work either.

How can I get past the login page, preferably without using Selenium?

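Best answer

That is because the form's action URL is different from the login page URL: the credentials have to be POSTed to the address in the form's action attribute, not back to the page that displays the form.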
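To see how the two URLs relate, the form's action can be resolved against the page URL. This is a minimal sketch, assuming the login form's action attribute is the relative path /session; that value is an assumption for illustration and should be read from the page rather than hard-coded.

```python
from urllib.parse import urljoin

login_page = "https://github.com/login"
# Assumed value of the form's action attribute; read it from the HTML in practice.
form_action = "/session"

# urljoin resolves a relative action against the page URL, as a browser would.
target = urljoin(login_page, form_action)
print(target)
```

Posting to login_page instead of target is what the code in the question does, which would explain the 422 response.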

Here is one way to do it using requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url = "https://github.com/login"
user = "<username>"
pwd = "<password>"

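with requests.Session() as s:

    # fetch the login page so the session picks up its cookies
    r = s.get(url)
    soup = BeautifulSoup(r.content, "lxml")

    # collect the hidden inputs and the URL the form actually posts to
    hidden = soup.find_all("input", {'type':'hidden'})
    target = "https://github.com" + soup.find("form")['action']
    # skip unnamed inputs; a hidden input without a value is sent as ""
    payload = {x["name"]: x.get("value", "") for x in hidden if x.get("name")}

    #add login creds to the dict
    payload['login'] = user
    payload['password'] = pwd

    r = s.post(target, data=payload)
    print(r)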
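The same idea can be checked offline, without logging in anywhere. The sketch below swaps BeautifulSoup for the standard library's html.parser and restricts the hidden fields to the first form on the page. LoginFormParser, build_login_request and SAMPLE_HTML are hypothetical names introduced for illustration, and the sample markup is made up rather than copied from GitHub.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Collects the action and hidden inputs of the first <form> on a page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.hidden = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        attrs = dict(attrs)
        if tag == "form" and not self._in_form:
            self._in_form = True
            self.action = attrs.get("action", "")
        elif tag == "input" and self._in_form:
            if attrs.get("type") == "hidden" and attrs.get("name"):
                # A hidden input without a value attribute is sent as "".
                self.hidden[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms on the page


def build_login_request(page_url, page_html, login, password):
    """Return (target_url, payload) for the first form found in page_html."""
    parser = LoginFormParser()
    parser.feed(page_html)
    # An empty or missing action means the form posts back to the page itself.
    target = urljoin(page_url, parser.action or "")
    payload = dict(parser.hidden, login=login, password=password)
    return target, payload


# Made-up markup for illustration only.
SAMPLE_HTML = """
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="hidden" name="commit">
  <input type="text" name="login">
  <input type="password" name="password">
</form>
<form action="/search"><input type="hidden" name="q" value="x"></form>
"""

target, payload = build_login_request(
    "https://github.com/login", SAMPLE_HTML, "<username>", "<password>"
)
print(target)
print(payload)
```

The resulting pair could then be passed to s.post(target, data=payload) on the same session that fetched the page, so that the cookies set by the GET are sent along with the form.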

Regarding the Python requests 422 POST error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50261869/
