
python - Getting search results from a website by sending a POST request while web scraping

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:39:08

This is the link to the website I want to get data from: Public Search of Trademarks

To do this I have to fill in a form, and I want to submit that form with the Python requests library. I wrote some code for it, please take a look:

from bs4 import BeautifulSoup
import requests, json

def returnJson(wordmark, page_class):
    url = "http://ipindiaonline.gov.in/tmrpublicsearch/frmmain.aspx"
    search_type = 'WM'
    postdata = {
        'ctl00$ContentPlaceHolder1$DDLFilter': '0',
        'ctl00$ContentPlaceHolder1$DDLSearchType': search_type,
        'ctl00$ContentPlaceHolder1$TBWordmark': wordmark,
        'ctl00$ContentPlaceHolder1$TBClass': page_class,
    }
    r = requests.post(url, data=postdata)
    return r

def scrapping(r):
    soup = BeautifulSoup(r.text, 'html.parser')
    print(soup.prettify())
    # soup.find_all('p')

scrapping(returnJson('AIWA', '2'))

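But when I run this code, the response is the same HTML as the search page itself. What I want is the search results, so that I can print them in the terminal.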

Note: I inspected the POST request that the page sends and built the postdata dictionary from it.

(Screenshot of the captured POST request)

Can anyone help me?

Best answer

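The POST request needs more values before it will work: the hidden form fields __VIEWSTATE and __EVENTVALIDATION, plus an __EVENTTARGET naming the search button. The hidden values can be obtained by first requesting the page without performing a search (if you run several searches, this probably only needs to be done once). For example: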

from bs4 import BeautifulSoup
import requests

def returnJson(wordmark, page_class):
    url = "http://ipindiaonline.gov.in/tmrpublicsearch/frmmain.aspx"

    # Fetch the empty search page first to read its hidden form fields.
    r_init = requests.get(url)
    soup = BeautifulSoup(r_init.text, 'html.parser')
    event_validation = soup.find("input", attrs={"name": "__EVENTVALIDATION"})['value']
    view_state = soup.find("input", attrs={"name": "__VIEWSTATE"})['value']

    search_type = 'WM'

    # Send the visible form fields together with the hidden values,
    # and name the search button as the event target.
    postdata = {
        'ctl00$ContentPlaceHolder1$DDLFilter': '0',
        'ctl00$ContentPlaceHolder1$DDLSearchType': search_type,
        'ctl00$ContentPlaceHolder1$TBWordmark': wordmark,
        'ctl00$ContentPlaceHolder1$TBClass': page_class,
        '__EVENTVALIDATION': event_validation,
        '__EVENTTARGET': 'ctl00$ContentPlaceHolder1$BtnSearch',
        '__VIEWSTATE': view_state,
    }

    r = requests.post(url, data=postdata)
    return r

def scrapping(r):
    soup = BeautifulSoup(r.text, 'html.parser')

    print(soup.prettify())
    # soup.find_all('p')

scrapping(returnJson('AIWA', '2'))
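
The answer above reads two hidden fields by name. As a rough sketch of the same idea, the hidden inputs can also be collected generically, so any other hidden field the page happens to contain is sent back too. The helper names (HiddenFieldParser, extract_hidden_fields, build_postdata) and the sample HTML are made up for illustration and are not taken from the real site. The sketch uses only the standard library so that it runs offline.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            # A hidden input without a value is submitted as an empty string.
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def build_postdata(html, form_values, event_target):
    """Merge hidden fields, visible form values, and the event target."""
    postdata = extract_hidden_fields(html)
    postdata.update(form_values)
    postdata["__EVENTTARGET"] = event_target
    return postdata


# Offline demonstration on an invented fragment (assumption: NOT the real page).
SAMPLE_HTML = """
<form>
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$ContentPlaceHolder1$TBWordmark" value="" />
</form>
"""

demo = build_postdata(
    SAMPLE_HTML,
    {"ctl00$ContentPlaceHolder1$TBWordmark": "AIWA"},
    "ctl00$ContentPlaceHolder1$BtnSearch",
)
print(demo)
```

In a real run, the dictionary returned by build_postdata(r_init.text, ...) could be passed as data= to requests.post in place of the hand-written postdata. Whether the site requires hidden fields beyond the two named in the answer is not confirmed here.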

Regarding "python - Getting search results from a website by sending a POST request while web scraping", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53015474/
