
python - AttributeError: 'list' object has no attribute 'extract'?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:15:58

I just want to extract information from this URL ( http://www.tuniu.com/g3300/whole-nj-0/list-l1602-h0-i-j0_0/ ) using XPath. When I run the code below, I get AttributeError: 'list' object has no attribute 'extract'. Is my module import wrong or mismatched?

# -*- coding: utf-8 -*-

import urllib2
import sys
import lxml.html as HTML
reload(sys)
sys.setdefaultencoding("utf-8")


class spider(object):
    def __init__(self):
        print u'开始爬取内容'

    def getSource(self, url):
        html = urllib2.Request(url)
        pageContent = urllib2.urlopen(html,timeout=60).read()
        return pageContent

    def getUrl(self, pageContent):
        htmlSource = HTML.fromstring(pageContent)
        urlInfo = htmlSource.xpath('//dd[@class="tqs"]/span/a/@href').extract()[0]
        return urlInfo


if __name__ == "__main__":
    url = "http://www.tuniu.com/g3300/whole-nj-0/list-l1602-h0-i-j0_0/"
    tuniu = spider()
    tuniu.getUrl(url)

It fails with the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\anzhuang\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)
  File "D:\anzhuang\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "D:/python/tuniu2/tuniu.py", line 34, in <module>
    tuniu.getUrl(url)
  File "D:/python/tuniu2/tuniu.py", line 27, in getUrl
    urlInfo = htmlSource.xpath('//dd[@class="tqs"]/span/a/@href').extract()[0]
AttributeError: 'list' object has no attribute 'extract'

Best answer

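First, getUrl is being called with the URL itself, but it never fetches the content at that URL. Change it so that it fetches the page content before parsing.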
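To illustrate this first point, here is a minimal sketch (written for Python 3, assuming lxml is installed). It parses the URL string itself and then parses a small HTML snippet. The snippet `sample_page` is a made-up stand-in for the downloaded page, not the real content of the site.

```python
import lxml.html as HTML

url = "http://www.tuniu.com/g3300/whole-nj-0/list-l1602-h0-i-j0_0/"
query = '//dd[@class="tqs"]/span/a/@href'

# Passing the URL string to fromstring() parses the text of the URL as markup.
# Nothing is downloaded, so the query has nothing to match.
hrefs_from_url = HTML.fromstring(url).xpath(query)

# Hypothetical stand-in for what getSource(url) would return.
sample_page = (
    '<html><body><dl>'
    '<dd class="tqs"><span><a href="/tour/1">Tour 1</a></span></dd>'
    '</dl></body></html>'
)
hrefs_from_page = HTML.fromstring(sample_page).xpath(query)

print(hrefs_from_url)
print(hrefs_from_page)
```

In this sketch the first query comes back empty while the second finds the link, which is why the page has to be fetched before it is parsed.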

Also, extract is not needed. The xpath call already returns a list, so to get the href just take an item from that list.

def getUrl(self, url):
    pageContent = self.getSource(url)  # <---
    htmlSource = HTML.fromstring(pageContent)
    urlInfo = htmlSource.xpath('//dd[@class="tqs"]/span/a/@href')[0]
    return urlInfo
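As a rough, self-contained check of the second point, the sketch below (Python 3, assuming lxml is installed) shows that the result of xpath here is an ordinary list with no extract method. `.extract()` is a method of Scrapy's selector objects, which is an observation added here and not something the original answer states. The `SAMPLE` markup and the `first_href` helper are hypothetical, introduced only for illustration; the helper also guards against an empty result, since indexing `[0]` on an empty list raises IndexError.

```python
import lxml.html as HTML

# Hypothetical markup shaped like the elements the question's query targets.
SAMPLE = """<html><body><dl>
<dd class="tqs"><span><a href="/tour/1">A</a></span></dd>
<dd class="tqs"><span><a href="/tour/2">B</a></span></dd>
</dl></body></html>"""

QUERY = '//dd[@class="tqs"]/span/a/@href'


def first_href(page_content, query=QUERY):
    """Return the first matching href, or None when nothing matches."""
    matches = HTML.fromstring(page_content).xpath(query)
    # matches is a plain list, so index it directly instead of calling .extract()
    return matches[0] if matches else None


matches = HTML.fromstring(SAMPLE).xpath(QUERY)
print(type(matches).__name__)
print(hasattr(matches, "extract"))
print(first_href(SAMPLE))
print(first_href("<html><body><p>no match</p></body></html>"))
```

Returning None on an empty result is one possible design choice; raising a descriptive exception would work equally well if a missing link should be treated as an error.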

Regarding python - AttributeError: 'list' object has no attribute 'extract'?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33668480/
