ruby-on-rails - Ruby Mechanize table scrape does not capture the whole row

Reposted. Author: 行者123. Updated: 2023-12-03 16:20:17
I am trying to scrape a table from a website using Mechanize.
I want to scrape the second row.

When I run:
agent.page.search('table.ea').search('tr')[-2].search('td').map{ |n| n.text }
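I expect it to scrape the whole row, but it only scrapes: ["2011-02-17", "0,00"]

Why does it scrape only the first and last columns instead of every column in the row?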

XPath:
/html/body/center/table/tbody/tr[2]/td[2]/table/tbody/tr[3]/td/table/tbody/tr[2]/td/table/tbody/tr[2]

CSS path:
html body center table tbody tr td table tbody tr td table tbody tr td table.ea tbody tr td.total

The page looks like this:

<table><table><table>
<table width="100%" border="0" cellpadding="0" cellspacing="1" class="ea">
<tr>
<th><a href="#">Date</a></th>
<th><a href="#">One</a></th>
<th><a href="#">Two</a></th>
<th><a href="#">Three</a></th>
<th><a href="#">Four</a></th>
<th><a href="#">Five</a></th>
<th><a href="#">Six</a></th>
<th><a href="#">Seven</a></th>
<th><a href="#">Eight</a></th>
</tr>
<tr>
<td><a href="#">2011-02-17</a></td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0,00</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">387</td>
<td align="right">0,00</td> <!-- FOV -->
<td align="right">0,00</td>
</tr>
<tr>
<td class="total">Ialt</td>
<td class="total" align="right">0</td>
<td class="total" align="right">40</td>
<td class="total" align="right">0,46</td>
<td class="total" align="right">2</td>
<td class="total" align="right">0</td>
<td class="total" align="right">0</td>
<td class="total" align="right">0</td>
<td class="total" align="right">3.060</td>
<td class="total" align="right">0,00</td>
<td class="total" align="right">18,58</td>
</tr>
</table>
</table></table></table>

Best answer

Using the following Ruby code ( https://gist.github.com/835603 ):

require 'mechanize'
require 'pp'

a = Mechanize.new { |agent|
  agent.user_agent_alias = 'Mac Safari'
}

a.get('http://binarymuse.net/table.html') do |page|
  pp page.search('table.ea').search('tr')[-2].search('td').map{ |n| n.text }
end

I get the following output:
["2011-02-17", "0", "0", "0,00", "0", "0", "0", "0", "387", "0,00", "0,00"]

Regarding ruby-on-rails - Ruby Mechanize table scrape does not capture the whole row, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/5023740/