
ruby - How do I get the HREF attribute of an anchor tag?

Reposted. Author: 数据小太阳. Updated: 2023-10-29 08:49:32

I am trying to scrape the sites listed on http://expo.getbootstrap.com/

The HTML looks like this:

<div class="col-span-4">
<p>
<a class="thumbnail" target="_blank" href="https://www.getsentry.com/">
<img src="/screenshots/sentry.jpg">
</a>
</p>
</div>

My Nokogiri code is:

url = "http://expo.getbootstrap.com/"
doc = Nokogiri::HTML(open(url))
puts doc.css("title").text
doc.css(".col-span-4").each do |site|
  title = site.css("h4 a").text
  href = site.css("a.thumbnail")[0]['href']
end

The goal is simple: get the href, the <img> tag's src, and the site's title. But it keeps reporting:

undefined method `[]' for nil:NilClass

on the line:

href = site.css("a.thumbnail")[0]['href']

This is really driving me crazy, because the code I wrote here actually works in another case.

Best answer

I would do something like this:

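require 'nokogiri'
require 'open-uri'
require 'pp'

# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
doc = Nokogiri::HTML(URI.open('http://expo.getbootstrap.com/'))

# Build one hash per thumbnail link
thumbnails = doc.search('a.thumbnail').map { |thumbnail|
  {
    href: thumbnail['href'],
    src: thumbnail.at('img')['src'],
    # a.thumbnail -> <p> -> enclosing <div>, which is expected to hold the <h4> title link
    title: thumbnail.parent.parent.at('h4 a').text
  }
}

pp thumbnails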

Which, after running, gives:

# => [
{
:href => "https://www.getsentry.com/",
:src => "/screenshots/sentry.jpg",
:title => "Sentry"
},
{
:href => "http://laravel.com",
:src => "/screenshots/laravel.jpg",
:title => "Laravel"
},
{
:href => "http://gruntjs.com",
:src => "/screenshots/gruntjs.jpg",
:title => "Grunt"
},
{
:href => "http://labs.bittorrent.com",
:src => "/screenshots/bittorrent-labs.jpg",
:title => "BitTorrent Labs"
},
{
:href => "https://www.easybring.com/en",
:src => "/screenshots/easybring.jpg",
:title => "Easybring"
},
{
:href => "http://developers.kippt.com/",
:src => "/screenshots/kippt-developers.jpg",
:title => "Kippt Developers"
},
{
:href => "http://www.learndot.com/",
:src => "/screenshots/learndot.jpg",
:title => "Learndot"
},
{
:href => "http://getflywheel.com/",
:src => "/screenshots/flywheel.jpg",
:title => "Flywheel"
}
]

Regarding "ruby - How do I get the HREF attribute of an anchor tag?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16793611/
