
Python BeautifulSoup multiprocessing error

Reposted. Author: 行者123. Last updated: 2023-12-01 08:57:43

import requests
from bs4 import BeautifulSoup
def display(urls):
    for u in urls:
        page = requests.get(u)
        c = page.content
        soup = BeautifulSoup(c,"html5lib")
        row = soup.find_all("table",{"style":"width: 500px;"})[0].find_all('tr')
        dict = {}
        for i in row:
            for title in i.find_all('span', attrs={'style':'color: #008000;'}):
                dict['Title'] = title.text
            for link in i.find_all('a',attrs={'title':'UPSC'}, href=True):
                dict['Link'] = link['href']
            print(dict)


from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(display(['http://www.freejobalert.com/upsc-advt-no-18/33742/','http://www.freejobalert.com/upsc-recruitment/16960/#Engg-Services2019']))

Output and error:

{'Title': 'Corrigendum', 'Link': 'http://www.freejobalert.com/wp-content/uploads/2018/09/Corrigendum-UPSC-Administrative-Officer-Lecturer-Posts.pdf'}
{'Title': ' Apply Online', 'Link': 'https://upsconline.nic.in/ora/VacancyNoticePub.php'}
{'Title': 'Notification ', 'Link': 'http://www.freejobalert.com/wp-content/uploads/2017/09/Notification-UPSC-Administrative-Officer-Lecturer-Posts.pdf'}
{'Title': ' Official Website', 'Link': 'http://www.upsc.gov.in/ '}
{'Title': 'Apply Online', 'Link': 'https://upsconline.nic.in/upsc/mainmenu2.php'}
Traceback (most recent call last):
  File "ask.py", line 94, in <module>
    results = pool.map(display(['http://www.freejobalert.com/upsc-advt-no-18/33742/','http://www.freejobalert.com/upsc-recruitment/16960/#Engg-Services2019']))
TypeError: map() missing 1 required positional argument: 'iterable'

Here I have implemented multiprocessing in Python. The results are printed as expected, but a TypeError is raised afterwards.

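Best answer

You are using ThreadPool.map incorrectly. You need to pass the function display itself and the list containing your URLs as two separate arguments. Writing pool.map(display([...])) calls display immediately (which is why the output appears) and hands only its return value to map, so the iterable argument is missing. Also, you don't need the for loop inside display(), because map applies the function to each URL in the list.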

import requests
from bs4 import BeautifulSoup

def display(url):
    # Fetch and parse a single page; pool.map calls this once per URL.
    page = requests.get(url)
    c = page.content
    soup = BeautifulSoup(c, "html5lib")
    # Rows of the first table styled with a 500px width.
    row = soup.find_all("table", {"style": "width: 500px;"})[0].find_all('tr')
    entry = {}  # renamed from dict to avoid shadowing the built-in
    for i in row:
        for title in i.find_all('span', attrs={'style': 'color: #008000;'}):
            entry['Title'] = title.text
        for link in i.find_all('a', attrs={'title': 'UPSC'}, href=True):
            entry['Link'] = link['href']
        print(entry)


from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(display, ['http://www.freejobalert.com/upsc-advt-no-18/33742/', 'http://www.freejobalert.com/upsc-recruitment/16960/#Engg-Services2019'])
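
To make the difference between the two calls concrete, here is a minimal sketch that needs no network access. The function describe and the example.com URLs are hypothetical stand-ins for display and the real pages; they are not part of the original question. It assumes only that multiprocessing.dummy provides a thread-backed Pool, which is what the code above already relies on.

```python
from multiprocessing.dummy import Pool as ThreadPool

def describe(url):
    """Hypothetical stand-in for display(): returns a dict instead of printing."""
    return {"Link": url}

# Placeholder URLs, used only for illustration.
urls = ["http://example.com/a", "http://example.com/b"]

# Wrong: describe(urls) runs first, and only its return value is passed
# to map(), so map() gets a single argument and raises TypeError.
with ThreadPool(2) as pool:
    try:
        pool.map(describe(urls))
        error = None
    except TypeError as exc:
        error = str(exc)

# Right: pass the function and the iterable separately; map() calls
# describe once per URL and returns the results in input order.
with ThreadPool(2) as pool:
    results = pool.map(describe, urls)

print(error)
print(results)
```

Returning a value from the worker, rather than only printing it, is what makes results useful: with the display function above, which has no return statement, pool.map would give back a list of None. Note also that multiprocessing.dummy runs the workers in threads rather than separate processes, which is usually adequate for network-bound work like this.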

Regarding this Python BeautifulSoup multiprocessing error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52688786/
