I'm new to Python and Scrapy. Below is sample code that shows a problem I ran into while collecting a dataset from a product on Amazon.
from scrapy.selector import HtmlXPathSelector
from amazoncrawler.items import AmazoncrawlerItem
import scrapy


class startcrawler(scrapy.Spider):
    name = "amazone"
    allowed_domains = ["www.amazon.co.uk"]
    start_urls = [
        "http://www.amazon.co.uk/product-reviews/B005KP74BI",
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        item = AmazoncrawlerItem()
        reviewText = hxs.xpath('//table[@id="productReviews"]/*/*/*/*/div/div' and '//div[@class="reviewText"]/text()').extract()
        ratings = hxs.xpath('//table[@id="productReviews"]/*/*/*/*/div/div' and '//span[contains(@class, "s_star")]/span/text()').extract()
        for text in reviewText:
            item['comment'] = text
            yield item
        for rating in ratings:
            item['rating'] = rating
            yield item
The output as a CSV file:
comment,rating
And they do last quite some time too.,
"Not a lot to say about a pair of 9v batteries, but I've not had any problems with Duracell for this purpose.",
Whilst there are quite a few rechargeable 9v ones around you are better off with these as the rechargeable types are not suggested for use in devices such as this.,
Nearly didnt buy these based on two bad reviews - glad I ignored them. Its the Genuine thing with 4 batteries in the pack sold by amazon themselves.,
"They say you only get what you pay for and I am a firm believer of that and certainly in this case it is without doubt, the price of these batteries however in the high street is quite extortionate, hence this is very good value from Amazon. These batteries outlast normal batteries by at least 5-7 times as I have proved to myself several times as I use batteries for my business to power test meters and I can confirm that if you put a run of the mill relatively cheap battery in some of my meters you will be lucky to get 3 days to a week out of them, that is depending on the use of the meter.",
"I still use cheap batteries but only for the likes of wall clocks and the like that do not have a high power drain and they last a reasonable length of time, sometimes up to 2 years. A classic example of how long a cheap battery last is for example my Gillette Fusion ProGlide powered razor, a cheap battery last about a week, but a Duracell lasts at least 5-6 weeks, as I say you only get what you pay for, highly rated batteries and at this price you cannot loose.",
great value for money and its why my wee town is loosing money as their selling one for the same price.,
Great Value for Duracell batteries. I need new ones for our 4 smoke alarms in our house. We normal go for cheap ones from pound shops but they don't last more then a week. When I came across these on Amazon at this price I brought them straight away. They came as describe no problems with them all in our smoke alarms and all tested and work that's what I brought them for to do and they do the job. Ignore the negative comments previous to stop you buying. There is no problems with these batteries,
"Put these into my smoke alarms, worked fine for 18 months before the alarms started the usual chirping at 3am to let you know the battery was dying. They were replaced, but the old ones still had enough power to run one of our baby's toys more a few more months.",
Good price and good shelf life too.,
"Bought 2 packs of these batteries in March 2014 to use in PIR sensors for a wireless alarm. Batteries in the sensors generally needed to be changed annually. These batteries lasted barely 5 months, very disappointing.",
"Arrived smartly Thanks and as stated fresh cells 2016 expiry, good for my smoke and CO2 alarms, postman had to ring bell as square box shape did not fit through letter box.",
"I purchased these because I needed one for a smoke alarm - but I knew it wouldn't be long before I needed others because all my alarms were purchased at the same time. Sure enough 5 weeks later I had to change another one. When the alarm instantly gave the ""low battery"" beeps I took it out and tested it - it was well down in the ""weak"" section. Was this a factory fault? or do employees swap their flat batteries for a new one in the box? There is no seal on the box to alert anyone to such a fiddle.",
"They're batteries. They fit well the bastard smoke detectors when they start bleeping bleeping away. They still won't shut up with the new batteries, but that's the bastard smoke detector's fault, and not the battery, which works fine.",
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",4.7 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",4.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",1.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",4.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",1.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",1.0 out of 5 stars
"Whoever thought of compulsory smoke detectors, and of their general ""safety"" features, would also benefit from having a batchload of these batteries inserted in him.",5.0 out of 5 stars
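My first problem is that the crawler extracts 3 review ratings from outside the table with ID "productReviews" and reports them as the first 3 review ratings. This happens consistently when I crawl other products as well. I can ignore it, but it would be good to know how to fix it.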
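Secondly, what I want is to have all the paragraphs of a review merged into one comment, with the corresponding rating separated from it by the delimiter: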
comment,rating
"And they do last quite some time too.
Not a lot to say about a pair of 9v batteries, but I've not had any problems with Duracell for this purpose.
Whilst there are quite a few rechargeable 9v ones around you are better off with these as the rechargeable types are not suggested for use in devices such as this.",4.0 out of 5 stars
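For reference, the layout above is ordinary CSV quoting: a field that contains line breaks is wrapped in double quotes and still counts as one cell. The following is a minimal sketch of how such a row could be produced with Python's standard `csv` module, assuming the paragraphs of one review are already collected in a list. The helper name `review_to_csv` and the sample list are made up for illustration and are not part of the spider.

```python
import csv
import io

# Hypothetical input: the paragraphs of one review and its rating,
# copied from the sample output above.
paragraphs = [
    "And they do last quite some time too.",
    "Not a lot to say about a pair of 9v batteries, but I've not had any "
    "problems with Duracell for this purpose.",
]
rating = "4.0 out of 5 stars"


def review_to_csv(paragraphs, rating):
    """Return CSV text with a header row and one review row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["comment", "rating"])
    # Joining with "\n" keeps every paragraph on its own line inside one cell;
    # the writer adds the surrounding double quotes automatically.
    writer.writerow(["\n".join(paragraphs), rating])
    return buffer.getvalue()


csv_text = review_to_csv(paragraphs, rating)
print(csv_text)
```

Joining with a space instead of `"\n"` would keep the whole comment on a single line, which some spreadsheet tools handle more predictably.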
Best answer
Iterate over the reviews in the table, instantiate an item inside the loop, and yield it:
def parse(self, response):
    reviews = response.xpath('//table[@id="productReviews"]//td/div')
    for review in reviews:
        item = AmazoncrawlerItem()
        item['comment'] = ' '.join(review.xpath('.//div[@class="reviewText"]/text()').extract())
        item['rating'] = review.xpath('.//span[contains(@class, "s_star")]/span/text()').extract()[0]
        yield item
Output:
{
'comment': u"And they do last quite some time too. Not a lot to say about a pair of 9v batteries, but I've not had any problems with Duracell for this purpose. Whilst there are quite a few rechargeable 9v ones around you are better off with these as the rechargeable types are not suggested for use in devices such as this.",
'rating': u'4.0 out of 5 stars'
}
...
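A likely explanation for the first problem in the question (ratings picked up from outside the table) is the `and` between the two XPath strings. In Python, `and` applied to two non-empty strings evaluates to the second string, so `hxs.xpath(...)` only ever receives the second expression, which starts with `//` and therefore matches anywhere on the page. The sketch below shows this with plain strings. The `scoped` expression is an assumption about how one combined XPath could look, based only on the element names used in the question, and has not been checked against the live page.

```python
# The two expressions that the question combines with `and`.
table_xpath = '//table[@id="productReviews"]/*/*/*/*/div/div'
text_xpath = '//div[@class="reviewText"]/text()'

# `and` returns its second operand when the first one is truthy,
# so the table restriction is silently dropped.
combined = table_xpath and text_xpath
print(combined)

# Assumed alternative: a single expression that keeps the table as a prefix.
scoped = '//table[@id="productReviews"]//div[@class="reviewText"]/text()'
print(scoped)
```

The answer's code avoids the issue in a different way: it first selects each review node inside the table and then runs relative queries (`.//`) on that node, which also keeps every rating attached to its own comment.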
Regarding "python - How to map multiple attributes of a website from multiple sections into a scrapy item?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/28924377/