vb.net - How do I extract data from an HTML table?

Reposted. Author: 行者123. Updated: 2023-12-02 02:07:25

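I recently downloaded HtmlAgilityPack, but I haven't found any real instructions on how to use it. I've tried to piece some code together from a few discussion board posts and other sources. This is what I have so far: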

Private Sub Button3_Click(ByVal sender As System.Object, ByVal e As System.EventArgs)
    Dim document As New HtmlAgilityPack.HtmlDocument
    document.LoadHtml("www.reuters.com/finance/stocks/overview?symbol=GOOG")

    Dim tabletag = document.DocumentNode.SelectSingleNode("//table[@class='data']/tr[1]/td[2]")
End Sub

As you can see, I'm using the HTML from www.reuters.com/finance/stocks/overview?symbol=GOOG.

I'm trying to extract the Beta value from this page. The value is currently 1.04.

When I run the code above, my Immediate window shows the following, repeated 100 times:

1.04
$243,156.41
328.59
--
--
Trading Report for (GOOG). A detailed report, including free correlated market analysis, and updates.
ValuEngine Detailed Valuation Report for GOOG
GOOGLE INC CL A (GOOG) 12-months forecast
GOOGLE INC CL A (GOOG) 2-weeks forecast
Google Inc: Business description, financial summary, 3yr and interim financials, key statistics/ratios and historical ratio analysis.

I only want to return the first number (1.04). What am I doing wrong? Any suggestions?

Best answer

You need to use cookies and a proxy. The following works well for me. Let me know what you think:

Imports System.Net

Public Class Form1

    ' Shared cookie container so cookies persist across requests.
    Public cookies As New CookieContainer

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click

        Dim wreq As HttpWebRequest = DirectCast(WebRequest.Create("http://www.reuters.com/finance/stocks/overview?symbol=GOOG"), HttpWebRequest)

        ' Present a browser-like user agent.
        wreq.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5"
        wreq.Method = "GET"

        ' Pass the current Windows credentials to the default proxy, if there is one.
        Dim prox As IWebProxy = wreq.Proxy
        If prox IsNot Nothing Then
            prox.Credentials = CredentialCache.DefaultCredentials
        End If

        wreq.CookieContainer = cookies

        ' Only needed if you load pages with web.Load(url) instead of the HttpWebRequest above.
        Dim web As New HtmlAgilityPack.HtmlWeb
        web.UseCookies = True
        web.PreRequest = New HtmlAgilityPack.HtmlWeb.PreRequestHandler(AddressOf onPreReq)

        Dim document As New HtmlAgilityPack.HtmlDocument

        ' Dispose the response and its stream once the document has been parsed.
        Using res As HttpWebResponse = DirectCast(wreq.GetResponse(), HttpWebResponse)
            Using stream = res.GetResponseStream()
                document.Load(stream, True)
            End Using
        End Using

        'just for testing:
        ' Dim tabletag = document.DocumentNode.SelectNodes("//table")
        ' MsgBox(tabletag.Count.ToString)

        'returns your field
        Dim tabletag2 = document.DocumentNode.SelectSingleNode("//td[@class='data']")
        If tabletag2 IsNot Nothing Then
            MsgBox(tabletag2.InnerText)
        End If

    End Sub

    ' Attaches the shared cookie container to every request made through HtmlWeb.
    Private Function onPreReq(req As HttpWebRequest) As Boolean
        req.CookieContainer = cookies
        Return True
    End Function
End Class
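
To see the XPath part in isolation, without the network request, here is a minimal sketch that parses an inline HTML string. The sample markup is hypothetical; it only assumes that the values sit in td cells with class "data", as the XPath above implies. It also shows that LoadHtml parses the string it is given as markup and does not download a URL, which may be why the code in the question found no table at all.

```vb
Module XPathSketch

    Sub Main()
        ' Hypothetical markup standing in for the real page; its structure is an assumption.
        Dim html As String = "<table>" &
            "<tr><td>Beta:</td><td class='data'>1.04</td></tr>" &
            "<tr><td>Market Cap:</td><td class='data'>$243,156.41</td></tr>" &
            "</table>"

        Dim doc As New HtmlAgilityPack.HtmlDocument()
        ' LoadHtml treats its argument as HTML text; it never fetches a URL.
        doc.LoadHtml(html)

        ' SelectSingleNode returns only the first node matching the XPath,
        ' which in this sample is the cell holding the Beta value.
        Dim firstCell = doc.DocumentNode.SelectSingleNode("//td[@class='data']")
        If firstCell IsNot Nothing Then
            Console.WriteLine(firstCell.InnerText.Trim())
        End If

        ' SelectNodes returns every match, or Nothing when nothing matches.
        Dim cells = doc.DocumentNode.SelectNodes("//td[@class='data']")
        If cells IsNot Nothing Then
            For Each cell In cells
                Console.WriteLine(cell.InnerText.Trim())
            Next
        End If
    End Sub

End Module
```

If the live page marks up the table differently, only the XPath string should need to change; the Nothing checks keep the code from throwing when no node matches.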

Regarding "vb.net - How do I extract data from an HTML table?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14322216/
