python - Data is not being scraped correctly

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:41:05
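I am trying to scrape the following page with Scrapy: https://www2.trollandtoad.com/buylist/?_ga=2.123753418.115346513.1562026676-1813285172.1559913561#!/M/10591 . Some of the data is scraped correctly, but the card name is not. Its selector is identical to the one for the set name, so the Card_Name field ends up holding the set name as well.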

    def parse(self, response):
        # Each table row holds one buylist entry
        for data in response.css("tr.ng-scope"):
            # Create a fresh GameItem (defined in items.py) for every row
            item = GameItem()
            item["Set"] = data.css("a.ng-binding.ng-scope::text").get()
            if item["Set"] is None:
                item["Set"] = data.css("span.ng-binding.ng-scope::text").get()
            # Problem: same selector as "Set", so this returns the set name again
            item["Card_Name"] = data.css("a.ng-binding.ng-scope::text").get()
            # Extract the condition, stock, and price from the same row
            item["Condition"] = data.css(r"td\.5557170.buylist_condition::text").get()
            item["Quantity"] = data.css("span.ng-binding::text").get()
            item["Price"] = data.css("span.ng-binding::text").get()
            yield item

Update #1

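I switched to XPath and can now get the card name instead of the set name, but it returns the same card name for every row rather than a different one per row.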

item["Card_Name"] = data.xpath("/html/body/div[2]/div[2]/div[1]/table[1]/tbody/tr[1]/td[2]/a/text()").get()

Best answer

card_names = response.xpath("//div/table/tbody/tr/td[contains(@class,'buylist_productname item')]/a/text()").getall()

This returns a list of the different card names, in the order in which they appear on the page.

This post is based on a similar question found on Stack Overflow, "python - Data is not being scraped correctly": https://stackoverflow.com/questions/56858673/
