
python - AttributeError: 'str' object has no attribute 'xpath'

Reposted. Author: 行者123. Updated: 2023-12-04 02:45:36

Using Python 3 and Scrapy 1.7.3, I am following the approach from this question: Scrapy - Extract items from table

But it gives me this error: AttributeError: 'str' object has no attribute 'xpath'

    <table border="1" cellspacing="0" class="GridViewStyle" id="ctl00_BodyContents_subheading_gridview" rules="all" style="border-collapse:collapse;">
<tbody><tr class="GridViewHeaderStyle" style="background-color:#66B6F4;">
<th scope="col">
<span id="ctl00_BodyContents_subheading_gridview_ctl01_SUBHEADING_CODES_HEADING" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">HS-Code</span>
</th><th scope="col">
<span id="ctl00_BodyContents_subheading_gridview_ctl01_SUBHEADING_DESCRIPTION_HEADING" style="padding:20px 20px 20px 5px;font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;margin:2px">Item Description</span>
</th>
</tr><tr class="GridViewRowStyle">
<td style="width:15%;">
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl02_SUBHEADING_CODES" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td><td style="width:85%;">
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl02_SUBHEADING_DESCRIPTION" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td>
</tr><tr class="GridViewAlternatingRowStyle">
<td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl03_SUBHEADING_CODES" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td><td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl03_SUBHEADING_DESCRIPTION" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td>
</tr><tr class="GridViewRowStyle">
<td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl04_SUBHEADING_CODES" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td><td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl04_SUBHEADING_DESCRIPTION" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td>
</tr><tr class="GridViewAlternatingRowStyle">
<td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl05_SUBHEADING_CODES" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td><td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl05_SUBHEADING_DESCRIPTION" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td>
</tr><tr class="GridViewRowStyle">
<td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl06_SUBHEADING_CODES" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td><td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl06_SUBHEADING_DESCRIPTION" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td>
</tr><tr class="GridViewAlternatingRowStyle">
<td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl07_SUBHEADING_CODES" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td><td>
<a href="http://link.domain" id="ctl00_BodyContents_subheading_gridview_ctl07_SUBHEADING_DESCRIPTION" style="font-family: Helvetica Neue,Helvetica,Arial,sans-serif !important;font-size: 14px;">value1</a>
</td>
</tr>
</tbody></table>

Scrapy code

    # -*- coding: utf-8 -*-
    import scrapy

    class CybexbotSpider(scrapy.Spider):
        name = 'cybexbot'
        allowed_domains = ['http://links.com']
        start_urls = ['http://links.com']

        def parse(self, response):
            data=response.xpath('//tr[contains(@class,"GridView")]').extract()
            for d in data[1:]:
                print(type(d))
                temp=dict()
                temp['Code']=d.xpath('tr//td[1]/a/text()').extract()
                temp['Desc']=d.xpath('tr//td[2]/a/text()').extract()
                yield temp

For each row I create a temporary dict and yield it.

The error I get is

  temp['Code']=d.xpath('tr//td[1]/a/text()').extract()
AttributeError: 'str' object has no attribute 'xpath'

Best answer

Try this:

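    import scrapy

    class CybexbotSpider(scrapy.Spider):
        name = 'cybexbot'
        # allowed_domains expects bare domain names, not URLs.
        allowed_domains = ['links.com']
        start_urls = ['http://links.com']

        def parse(self, response):
            # No .extract() here: keep Selector objects so .xpath() still works.
            data = response.xpath('//tr[contains(@class,"GridView")]')
            for d in data[1:]:  # data[0] is the header row
                temp = dict()
                # d is already a <tr>, so address its cells with a relative path.
                temp['Code'] = d.xpath('./td[1]/a/text()').extract()
                temp['Desc'] = d.xpath('./td[2]/a/text()').extract()
                yield temp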

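Calling .extract() converts the selectors into plain strings, so each d in the loop is a str, and a str has no .xpath() method. Keep the row query as selectors and call .extract() only on the final text nodes.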

Regarding python - AttributeError: 'str' object has no attribute 'xpath', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57417774/
