
python - Scrapy CSV output does not have multiple rows for each column

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:54:34

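I am trying to scrape an events website, and the attached code scrapes each event's name and location. I write the output to a CSV file, but the file ends up with all the event names appended to each other in a single row.

For example, say I have two events, Bruno Mars and Maroon 5, located in San Jose and Santa Clara. The current output is:

Event Name    Event Location

Bruno Mars, Maroon 5    San Jose, Santa Clara

But I would like to see:

Event Name    Event Location

Bruno Mars    San Jose

Maroon 5    Santa Clara

Can someone tell me why the format comes out this way? My code is attached below. I run it with scrapy crawl event_spider -o output.csv -t csv.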

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector

from event_test.items import EventItem


class EventSpider(BaseSpider):
    name = "event_spider"
    allowed_domains = ["eventful.com"]
    start_urls = [
        "http://eventful.com/sanjose/events"
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        events = hxs.select("/html/body[@id='events']/div[@id='outer-container']/div[@id='mid-container']/div[@id='inner-container']/div[@id='content']/div[@class='cols-2-1']/div[@class='alpha']/div[@id='top-events']/div[@class='section top-events cage-dbl-border cage-bdr-mdgrey']/div[@id='events-scroll']/div[@id='events-scroll-items']/ul[@id='events-scroll-items-list']/li[@class='top-events-item ']")
        items = []
        for event in events:
            item = EventItem()
            item['event_name'] = event.select("//h2/a/span/text()").extract()
            item['event_locality'] = event.select("//span[@class='locality']/text()").extract()
            items.append(item)
        return items

Best answer

I simplified the spider's code and its XPath expressions:

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from event_test.items import EventItem


class EventSpider(BaseSpider):
    name = "event_spider"
    allowed_domains = ["eventful.com"]
    start_urls = ["http://eventful.com/sanjose/events"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        # Match each event <li> by class instead of spelling out the full path.
        events = hxs.select("//li[contains(@class, 'top-events-item')]")
        for event in events:
            item = EventItem()
            # The leading "." makes the XPath relative to the current event;
            # [0] stores the first match as a string rather than a list.
            item['event_name'] = event.select(".//h2/a/span/text()").extract()[0]
            item['event_locality'] = event.select(".//span[@class='locality']/text()").extract()[0]
            yield item
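
The change that matters most is the leading dot in the inner XPath expressions. An expression starting with "//" searches the whole document no matter which node it is called on, while ".//" searches only below the current node. Here is a rough sketch of that difference using the standard library's ElementTree instead of Scrapy. The markup is made up for illustration, and because ElementTree does not accept an absolute "//" path on an element, the document-wide search is simulated by querying from the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the real event page.
PAGE = """
<html><body><ul>
  <li class="top-events-item"><h2><a><span>Bruno Mars</span></a></h2>
      <span class="locality">San Jose</span></li>
  <li class="top-events-item"><h2><a><span>Maroon 5</span></a></h2>
      <span class="locality">Santa Clara</span></li>
</ul></body></html>
""".strip()

root = ET.fromstring(PAGE)
events = root.findall(".//li[@class='top-events-item']")

# Document-wide search repeated once per event (the effect of "//h2/a/span"):
# every event ends up with the names of all events.
document_wide = [[s.text for s in root.findall(".//h2/a/span")] for _ in events]

# Search relative to each <li> (the effect of ".//h2/a/span"):
# every event gets only its own name.
relative = [[s.text for s in li.findall(".//h2/a/span")] for li in events]

print(document_wide)
print(relative)
```

The first list repeats both names for every event, which is the shape of the output in the question. The second list has exactly one name per event.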

With this spider, the CSV file contains:

event_name,event_locality
Under the Influence of Music Tour,Mountain View
Bruno Mars,San Jose
John Mayer: Born & Raised Tour 2013,Mountain View
New Kids on the Block with 98 Degrees and ...,San Jose
Amy Grant,San Jose
Styx,Saratoga
Bob Dylan with Wilco,Mountain View
Kenny Chesney with Eli Young Band,Mountain View
Smash Mouth / Sugar Ray / Gin Blossoms \...,Saratoga
Creedence Clearwater Revisited / 38 Special,Saratoga
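
Each item that is yielded becomes one CSV row, so the row layout follows from the shape of the items. Below is a minimal sketch of that using only the csv module. The to_csv helper is hypothetical and only imitates an exporter, on the assumption that list-valued fields are joined with commas (which, as far as I know, is the default of Scrapy's CSV exporter).

```python
import csv
import io

FIELDS = ("event_name", "event_locality")


def to_csv(items):
    """Hypothetical stand-in for a CSV exporter: one row per item,
    with list-valued fields joined by commas (an assumption)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    for item in items:
        row = []
        for key in FIELDS:
            value = item[key]
            row.append(",".join(value) if isinstance(value, list) else value)
        writer.writerow(row)
    return buf.getvalue()


# One item whose fields are lists: everything lands in a single row.
lists_in_one_item = [
    {"event_name": ["Bruno Mars", "Maroon 5"],
     "event_locality": ["San Jose", "Santa Clara"]},
]

# One item per event with plain string fields: one row per event.
one_item_per_event = [
    {"event_name": "Bruno Mars", "event_locality": "San Jose"},
    {"event_name": "Maroon 5", "event_locality": "Santa Clara"},
]

print(to_csv(lists_in_one_item))
print(to_csv(one_item_per_event))
```

This is why the answer takes extract()[0] for each field: the item then holds a single string per column instead of a list.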

Hope that helps.

Regarding "python - Scrapy CSV output does not have multiple rows for each column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17577551/
