
xpath - Scrapy and XPath issue with nested XPath

Reposted. Author: 行者123. Updated: 2023-12-03 17:12:42

I'm trying to scrape Amazon products.
I start from a random category page with this XPath:

products = Selector(response).xpath('//div[@class="s-item-container"]')
for product in products:
    item = AmzItem()
    item['title'] = product.xpath('//a[@class="s-access-detail-page"]/@title').extract()[0]
    item['url'] = product.xpath('//a[@class="s-access-detail-page"]/@href').extract()[0]
    yield item


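('//div[@class="s-item-container"]') returns every div that holds a product on a category page, and that part works correctly.

Now, how do I get the link to each product?

// means anywhere in the document
a with @class should select the right class
But I get this error: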

item['title'] = product.xpath('//a[@class="s-access-detail-page"]/@title').extract()[0]
exceptions.IndexError: list index out of range


So the list matched by this XPath must be empty, but I don't understand why.

EDIT:
The HTML looks like this:

<div class="s-item-container" style="height: 343px;">
<div class="a-row a-spacing-base">
<div class="a-column a-span12 a-text-left">
<div class="a-section a-spacing-none a-inline-block s-position-relative">
<a class="a-link-normal a-text-normal" href="https://rads.stackoverflow.com/amzn/click/com/B0105S434A" rel="nofollow noreferrer"><img alt="Product Details" src="http://ecx.images-amazon.com/images/I/41%2BzrAY74UL._AA160_.jpg" onload="viewCompleteImageLoaded(this, new Date().getTime(), 24, false);" class="s-access-image cfMarker" height="160" width="160"></a>
<div class="a-section a-spacing-none a-text-center">
<div class="a-row a-spacing-top-mini">
<a class="a-size-mini a-link-normal a-text-normal" href="https://rads.stackoverflow.com/amzn/click/com/B0105S434A" rel="nofollow noreferrer">
<div class="a-box">
<div class="a-box-inner a-padding-mini"><span class="a-color-secondary">See more choices</span></div>
</div>
</a>
</div>
</div>
</div>
</div>
</div>
<div class="a-row a-spacing-mini">
<div class="a-row a-spacing-none">
<a class="a-link-normal s-access-detail-page a-text-normal" title="Harry Potter Gryffindor School Fancy Robe Cloak Costume And Tie (Size S)" href="https://rads.stackoverflow.com/amzn/click/com/B0105S434A" rel="nofollow noreferrer">
<h2 class="a-size-base a-color-null s-inline s-access-title a-text-normal">Harry Potter Gryffindor School Fancy Robe Cloak Costume And Tie (Size S)</h2>
</a>
</div>
<div class="a-row a-spacing-mini"><span class="a-size-small a-color-secondary">by </span><span class="a-size-small a-color-secondary">Legend</span></div>
</div>
<div class="a-row a-spacing-mini">
<div class="a-row a-spacing-none"><a class="a-size-small a-link-normal a-text-normal" href="http://www.amazon.com/gp/offer-listing/B0105S434A/ref=sr_1_21_olp?s=pet-supplies&amp;ie=UTF8&amp;qid=1435391788&amp;sr=1-21&amp;keywords=pet+supplies&amp;condition=new"><span class="a-size-base a-color-price a-text-bold">$28.99</span><span class="a-letter-space"></span>new<span class="a-letter-space"></span><span class="a-color-secondary">(1 offer)</span><span class="a-letter-space"></span><span class="a-color-secondary a-text-strike"></span></a></div>
</div>
<div class="a-row a-spacing-none"><span name="B0105S434A">
<span class="a-declarative" data-action="a-popover" data-a-popover="{&quot;max-width&quot;:&quot;700&quot;,&quot;closeButton&quot;:&quot;false&quot;,&quot;position&quot;:&quot;triggerBottom&quot;,&quot;url&quot;:&quot;/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B0105S434A&amp;contextId=search&amp;ref=acr_search__popover&quot;}"><a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star a-star-4"><span class="a-icon-alt">3.9 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a></span></span>
<a class="a-size-small a-link-normal a-text-normal" href="https://rads.stackoverflow.com/amzn/click/com/B0105S434A" rel="nofollow noreferrer">48</a>
</div>
</div>

Best answer

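//a[@class="s-access-detail-page"] matches only when the attribute is exactly class="s-access-detail-page", because XPath compares the attribute as a plain string and knows nothing about its meaning :) When an element has multiple classes, use the contains function: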

//a[contains(concat(' ', @class, ' '), " s-access-detail-page ")]/@title
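
To make the difference concrete, here is a minimal sketch that uses only the Python standard library rather than Scrapy. The SNIPPET string is a trimmed, well-formed stand-in for the HTML in the question, and has_class is a hypothetical helper that mimics the contains(concat(...)) test above, because xml.etree.ElementTree does not support contains() itself.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for one product block from the question.
SNIPPET = """
<div class="s-item-container">
  <a class="a-link-normal a-text-normal" href="https://rads.stackoverflow.com/amzn/click/com/B0105S434A">image</a>
  <a class="a-link-normal s-access-detail-page a-text-normal"
     title="Harry Potter Gryffindor School Fancy Robe Cloak Costume And Tie (Size S)"
     href="https://rads.stackoverflow.com/amzn/click/com/B0105S434A">title</a>
</div>
"""


def has_class(element, name):
    """Hypothetical helper: same test as
    contains(concat(' ', @class, ' '), ' name ')."""
    return f" {name} " in f" {element.get('class', '')} "


product = ET.fromstring(SNIPPET)

# Exact string comparison: the attribute holds three classes, so nothing matches.
exact = product.findall(".//a[@class='s-access-detail-page']")

# Token comparison: matches the anchor whose class list contains the token.
by_token = [a for a in product.iter("a") if has_class(a, "s-access-detail-page")]

print(len(exact))
print(by_token[0].get("title"))
print(by_token[0].get("href"))
```

Padding both sides with spaces keeps a longer class name that merely contains the token as a substring from matching. In the Scrapy loop from the question it is probably also worth starting the expression with a dot, as in product.xpath('.//a[contains(concat(" ", @class, " "), " s-access-detail-page ")]/@title'), because an expression that begins with // is evaluated against the whole document even when it is called on a nested selector, so every item would otherwise receive the first match on the page.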

For more on this topic, see the similar question on Stack Overflow: https://stackoverflow.com/questions/31087097/
