
html - Unable to fetch names from a website when using pattern in Python

Reposted. Author: 行者123. Updated: 2023-11-28 02:22:57

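My code raises an error when I try to fetch the names from the website, but it returns the amounts perfectly when I fetch those instead.

Here is the code I use to fetch the amounts: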

import requests
import re
from pattern import web
import pandas as pd

def list_of_prices(url):
    html = requests.get(url).text
    dom = web.DOM(html)
    list = []
    for person in dom('.freelancer-list-item .medium.price-tag'):
        amount = person('span')
        list.append([amount[0].content if amount else 'na'])
    return list

list_of_prices('https://www.peopleperhour.com/freelance/data+analyst?page=1')

The result is as follows:

[[u'$20<small>PER HOUR</small>'],
[u'$20<small>PER HOUR</small>'],
[u'$68<small>PER HOUR</small>'],
[u'$45<small>PER HOUR</small>'],
[u'$38<small>PER HOUR</small>'],
[u'$61<small>PER HOUR</small>'],
[u'$20<small>PER HOUR</small>'],
[u'$34<small>PER HOUR</small>'],
[u'$35<small>PER HOUR</small>'],
[u'$14<small>PER HOUR</small>'],
[u'$27<small>PER HOUR</small>'],
[u'$47<small>PER HOUR</small>'],
[u'$40<small>PER HOUR</small>'],
[u'$12<small>PER HOUR</small>'],
[u'$15<small>PER HOUR</small>'],
[u'$61<small>PER HOUR</small>'],
[u'$68<small>PER HOUR</small>'],
[u'$15<small>PER HOUR</small>'],
[u'$14<small>PER HOUR</small>'],
[u'$25<small>PER HOUR</small>']]

How can I remove the <small>PER HOUR</small> part from this output?
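One possible way to clean the price strings shown above is to strip the `<small>...</small>` element with a regular expression. This is only a minimal sketch and not part of the original post: the helper name `strip_small` and the sample rows are assumptions for illustration, and it is written for Python 3.

```python
import re

def strip_small(price_html):
    """Remove any <small>...</small> element and surrounding whitespace."""
    return re.sub(r"<small>.*?</small>", "", price_html).strip()

# Hypothetical sample rows in the same shape as the scraped output.
raw = [["$20<small>PER HOUR</small>"], ["$68<small>PER HOUR</small>"], ["na"]]

# Each row is a one-element list, so clean its first item.
cleaned = [strip_small(row[0]) for row in raw]
print(cleaned)
```

Rows that carry no `<small>` element, such as the `'na'` placeholder, pass through unchanged.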

Here is the code I use to try to fetch the names:

import requests
import re
from pattern import web
import pandas as pd

def list_of_names(url):
    html = requests.get(url).text
    dom = web.DOM(html)
    list = []
    for person in dom ('.freelancer-list-item .freelancer__name crop'):
        title = person('a.link')
    list.append([title[0].content if title else 'na'])
    return list

list_of_names('https://www.peopleperhour.com/freelance/data+analyst?page=1')

But it does not fetch the names and shows this error instead:

   ---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
<ipython-input-36-77ae0c541f2d> in <module>()
11 list.append([title[0].content if title else 'na'])
12 return list
---> 13 list_of_names('https://www.peopleperhour.com/freelance/data+analyst?page=1')

<ipython-input-36-77ae0c541f2d> in list_of_names(url)
9 for person in dom ('.freelancer-list-item .freelancer__name crop'):
10 title = person('a.link')
---> 11 list.append([title[0].content if title else 'na'])
12 return list
13 list_of_names('https://www.peopleperhour.com/freelance/data+analyst?page=1')

UnboundLocalError: local variable 'title' referenced before assignment

How can I fix this error? Please help.

Thanks!

Best answer

I am not familiar with pattern, but I suggest you try the following. It should work:

import requests
from pattern import web

page_link = "https://www.peopleperhour.com/freelance/data+analyst?page=1"

def list_of_names(url):
    html = requests.get(url).text
    dom = web.DOM(html)
    list_item = []
    # Select each .freelancer__info block, then the .link element inside it.
    for person in dom('.freelancer__info'):
        title = person('.link')
        # Append inside the loop so title is always assigned before use.
        list_item.append([title[0].content if title else 'na'])
    return list_item

if __name__ == '__main__':
    print(list_of_names(page_link))
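The post does not say why the original attempt failed. A likely explanation, offered here as an assumption: in standard CSS selector syntax, `.freelancer__name crop` asks for an element named `crop` nested inside `.freelancer__name`, so it probably matches nothing (two classes on one element would be written `.freelancer__name.crop`). If the `list.append(...)` line then ran outside the `for` loop, which is what the traceback suggests, `title` was never assigned. The sketch below reproduces that mechanism in plain Python 3 with no network access and no pattern dependency; `collect` and `collect_safe` are hypothetical names.

```python
def collect(items):
    """Mirror the failing shape: the append sits outside the loop."""
    results = []
    for item in items:
        title = item.strip()
    # If items is empty, the loop body never ran and title is unbound here.
    results.append(title)
    return results


def collect_safe(items):
    """Append inside the loop, so an empty match yields an empty list."""
    results = []
    for item in items:
        results.append(item.strip() or "na")
    return results


try:
    collect([])
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(collect_safe([]))
print(collect_safe([" Alice ", ""]))
```

With the append inside the loop, a selector that matches nothing returns an empty list rather than raising, which makes a wrong selector easier to spot.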

Regarding "html - Unable to fetch names from a website when using pattern in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48102898/
