Referencing an item's attributes in Python and Scrapy

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:16:37

The Scrapy tutorial defines an item with the following code.

import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

It then gives the code for the spider.

import scrapy

from tutorial.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = DmozItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item

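My question is: why can they use [] brackets to reference the item's title? I thought that to reference a variable you would write item.title = another. Am I missing something?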

Best answer

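This is because, under the hood, Scrapy uses the UserDict.DictMixin mixin for the Item class, which gives items the dictionary interface and therefore the item['title'] syntax: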

class UserDict.DictMixin

Mixin defining all dictionary methods for classes that already have a minimum dictionary interface including __getitem__(), __setitem__(), __delitem__(), and keys().
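To make the mixin idea concrete, here is a minimal sketch of a class that gains the [] syntax by defining a few core methods. It assumes Python 3, where UserDict.DictMixin no longer exists and collections.abc.MutableMapping plays the equivalent role. SimpleItem is a hypothetical name for illustration, not Scrapy's actual Item class.

```python
from collections.abc import MutableMapping


class SimpleItem(MutableMapping):
    """Hypothetical dict-like container; not Scrapy's actual Item class."""

    def __init__(self):
        self._values = {}

    # The five methods MutableMapping requires from a subclass.
    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


item = SimpleItem()
item["title"] = "Example"        # item[...] = ... calls __setitem__
print(item["title"])             # item[...] calls __getitem__
print(item.get("link", "n/a"))   # get() is supplied by the mixin
print(list(item.keys()))         # keys() is supplied by the mixin too
```

The point of the sketch is that the bracket syntax is not special to dict: any class that defines __getitem__ and __setitem__ supports it, and the mixin fills in the remaining dictionary methods.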

In addition, quoting Scrapy's documentation:

Item objects are simple containers used to collect the scraped data. They provide a dictionary-like API with a convenient syntax for declaring their available fields.
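As a rough illustration of what "a dictionary-like API with declared fields" can mean, the sketch below builds a small container that only accepts keys declared as class attributes. Field, DeclaredItem, BookItem and declared_fields are all hypothetical names for this example; Scrapy's real implementation differs in detail.

```python
class Field(dict):
    """Hypothetical stand-in for per-field metadata."""


class DeclaredItem:
    """Dict-like container that only accepts fields declared on the class."""

    def __init__(self):
        self._values = {}

    @classmethod
    def declared_fields(cls):
        # Collect every class attribute that is a Field, including inherited ones.
        return {
            name
            for klass in cls.__mro__
            for name, value in vars(klass).items()
            if isinstance(value, Field)
        }

    def __setitem__(self, key, value):
        if key not in self.declared_fields():
            raise KeyError(f"{type(self).__name__} does not support field: {key}")
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


class BookItem(DeclaredItem):
    title = Field()
    link = Field()


item = BookItem()
item["title"] = ["Learning Python"]   # declared field: accepted

try:
    item["author"] = "unknown"        # undeclared field: rejected
except KeyError as exc:
    rejected = str(exc)
    print("rejected:", rejected)
```

In this design the class attributes only declare which keys are valid, while the scraped values live in a separate per-instance dictionary reached through []. That separation is one plausible reason to prefer item['title'] over item.title for the data itself.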

See also the actual implementation.

Regarding referencing an item's attributes in Python and Scrapy, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24918603/
