ruby-on-rails - How to extract text from a <script> tag using Nokogiri and Mechanize?

Reposted. Author: 数据小太阳. Updated: 2023-10-29 08:26:36

This is part of the source code of a booking site:

<script>
booking.ensureNamespaceExists('env');
booking.env.b_map_center_latitude = 53.36480155016638;
booking.env.b_map_center_longitude = -2.2752803564071655;
booking.env.b_hotel_id = '35523';
booking.env.b_query_params_no_ext = '?label=gen173nr-17CAEoggJCAlhYSDNiBW5vcmVmaFCIAQGYAS64AQTIAQTYAQHoAQH4AQs;sid=e1c9e4c7a000518d8a3725b9bb6e5306;dcid=1';
</script>

I want to extract booking.env.b_hotel_id, so that I end up with the value '35523'. How can I do this with Nokogiri and Mechanize?

Hope someone can help. Thanks! :)

Best answer

require 'mechanize'

agent = Mechanize.new
page = agent.get('http://www.booking.com/hotel/us/solera-by-stay-alfred.html?label=gen173nr-17CAEoggJCAlhYSDNiBW5vcmVmcgV1c19ueYgBAZgBMbgBBMgBBNgBAegBAfgBAg;sid=695d6598485cb1a8fd9e39c5de3878ba;dcid=4;checkin=2015-10-20;checkout=2015-10-21;dist=0;group_adults=2;room1=A%2CA;sb_price_type=total;srfid=cf5d76283b73d34a1d7e0d61cad6974e38a94351X1;type=total;ucfs=1&')

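# Join the text of every <script> element, then find the hotel id assignment.
# The dots are escaped so they match literal dots; .*? stops at the first closing quote.
match = page.search("script").text.scan(/^booking\.env\.b_hotel_id = '.*?'/)
puts match
# Take the part between the single quotes.
puts match[0].split("'")[1]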

Output:

booking.env.b_hotel_id = '1202411'
1202411
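
The same regular-expression approach can also be written with a capture group, which returns the id directly and avoids the later split on quotes. What follows is only a sketch: extract_hotel_id and HOTEL_ID_PATTERN are hypothetical names that do not appear in the original answer, and the sample string stands in for the script text that Mechanize and Nokogiri would return from search("script").text.

```ruby
# Hypothetical helper (not from the original answer): pulls the hotel id
# out of raw JavaScript text using a capture group.
# \s* tolerates varying whitespace around "="; [^']* stops at the closing quote.
HOTEL_ID_PATTERN = /booking\.env\.b_hotel_id\s*=\s*'([^']*)'/

def extract_hotel_id(script_text)
  # String#[] with a regex and a group index returns that capture,
  # or nil when the pattern does not match.
  script_text[HOTEL_ID_PATTERN, 1]
end

# Stand-in for the script text of a fetched page (taken from the question).
sample = <<~JS
  booking.ensureNamespaceExists('env');
  booking.env.b_map_center_latitude = 53.36480155016638;
  booking.env.b_hotel_id = '35523';
JS

puts extract_hotel_id(sample)
puts extract_hotel_id("no id here").inspect
```

Returning nil when nothing matches lets the caller handle a page without the assignment, whereas match[0].split would raise NoMethodError on an empty result.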

Pages that helped me solve this:

http://robdodson.me/crawling-pages-with-mechanize-and-nokogiri/

Parsing javascript function elements with nokogiri

Regular expression - starting and ending with a character string

http://www.rubular.com

Regarding "ruby-on-rails - How to extract text from a <script> tag using Nokogiri and Mechanize?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33202735/
