gpt4 book ai didi

html - VBA - Google 新闻搜索结果的数量

转载 作者:行者123 更新时间:2023-12-04 22:30:48 24 4
gpt4 key购买 nike

我有一个单元格,其中包含我想在谷歌新闻中搜索的内容。我希望代码返回该搜索的结果数。目前我有这个代码,我在网站的其他地方找到了这个代码,不使用谷歌新闻,但即便如此,我有时也会得到一个

runtime error -2147024891 (80070005)



经过70左右的搜索,我无法再次运行。
Sub HawkishSearch()

Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object
Dim start_time As Date
Dim end_time As Date

lastRow = Range("B" & Rows.Count).End(xlUp).Row

Dim cookie As String
Dim result_cookie As String

start_time = Time
Debug.Print "start_time:" & start_time

For i = 2 To lastRow

url = "https://www.google.co.in/search?q=" & Cells(i, 2) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send

Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText

If html.getElementById("resultStats") Is Nothing Then
str_text = "0 Results"
Else
str_text = html.getElementById("resultStats").innerText
End If
Cells(i, 3) = str_text
DoEvents
Next

end_time = Time
Debug.Print "end_time:" & end_time

Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub

最佳答案

最佳选择 (IMO) 是使用 Google News API并注册一个 API key 。然后,您可以使用包含您的搜索词的 queryString 并解析 JSON 响应以获取结果计数。我在下面执行此操作,并使用文章标题和链接填充集合。我使用名为 JSONConverter.bas 的 JSON 解析器您下载并添加到您的项目中。然后,您可以转到 VBE > 工具 > 引用 > 添加对 Microsoft 脚本运行时的引用。

来自 API 的 JSON 响应示例:

enter image description here
{}表示您可以通过键访问的字典,[]表示您通过索引或 For Each 访问的集合循环过去。

我使用 key totalResults从 API 返回的初始字典中检索总结果计数。

然后我循环字典(文章)的集合并提取故事标题和 URL。

然后,您可以在本地窗口中检查结果或打印出来

本地窗口中的结果示例:

enter image description here

Option Explicit

Public Sub GetStories()
Dim articles As Collection, article As Object
Dim searchTerm As String, finalResults As Collection, json As Object, arr(0 To 1)
Set finalResults = New Collection
searchTerm = "Obama"

With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://newsapi.org/v2/everything?q=" & searchTerm & "&apiKey=yourAPIkey", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
Set json = JsonConverter.ParseJson(.responseText)
End With

Debug.Print "total results = " & json("totalResults")

Set articles = json("articles")
For Each article In articles
arr(0) = article("title")
arr(1) = article("url")
finalResults.Add arr
Next

Stop '<== Delete me later

End Sub

环形:

如果在循环中部署,您可以使用类 clsHTTP保存 XMLHTTP 对象。这比创建和销毁更有效。我为这个类提供了一个方法 GetString从 API 检索 JSON 响应,以及 GetInfo方法来解析 JSON 并检索结果计数以及 API 结果 URL 和标题。

本地窗口中的结果结构示例:

enter image description here

类 clsHTTP:
Option Explicit   
Private http As Object

Private Sub Class_Initialize()
Set http = CreateObject("MSXML2.XMLHTTP")
End Sub

Public Function GetString(ByVal url As String) As String
With http
.Open "GET", url, False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
GetString = .responseText
End With
End Function

Public Function GetInfo(ByVal json As Object) As Variant
Dim results(), counter As Long, finalResults(0 To 1), articles As Object, article As Object

finalResults(0) = json("totalResults")
Set articles = json("articles")

ReDim results(1 To articles.Count, 1 To 2)

For Each article In articles
counter = counter + 1
results(counter, 1) = article("title")
results(counter, 2) = article("url")
Next

finalResults(1) = results
GetInfo = finalResults
End Function

标准模块:
Option Explicit

Public Sub GetStories()
Dim http As clsHTTP, json As Object
Dim finalResults(), searchTerms(), searchTerm As Long, url As String
Set http = New clsHTTP

With ThisWorkbook.Worksheets("Sheet1")
searchTerms = Application.Transpose(.Range("A1:A2")) '<== Change to appropriate range containing search terms
End With

ReDim finalResults(1 To UBound(searchTerms))

For searchTerm = LBound(searchTerms, 1) To UBound(searchTerms, 1)

url = "https://newsapi.org/v2/everything?q=" & searchTerms(searchTerm) & "&apiKey=yourAPIkey"

Set json = JsonConverter.ParseJson(http.GetString(url))

finalResults(searchTerm) = http.GetInfo(json)

Set json = Nothing

Next

Stop '<==Delete me later
End Sub

'

否则:

我将使用以下内容,通过他们的类名获取故事链接。我得到计数​​并将链接写入集合
Option Explicit

Public Sub GetStories()
Dim sResponse As String, html As HTMLDocument, articles As Collection
Const BASE_URL As String = "https://news.google.com/"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://news.google.com/topics/CAAqIggKIhxDQkFTRHdvSkwyMHZNRGxqTjNjd0VnSmxiaWdBUAE?hl=en-US&gl=US&ceid=US:en", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set html = New HTMLDocument: Set articles = New Collection
Dim numberOfStories As Long, nodeList As Object, i As Long
With html
.body.innerHTML = sResponse
Set nodeList = .querySelectorAll(".VDXfz")
numberOfStories = nodeList.Length
Debug.Print "number of stories = " & numberOfStories
For i = 0 To nodeList.Length - 1
articles.Add Replace$(Replace$(nodeList.item(i).href, "./", BASE_URL), "about:", vbNullString)
Next
End With
Debug.Print articles.Count
End Sub

标准谷歌搜索:

以下是一个示例标准 google 搜索,但根据您的搜索词,您不会总是获得相同的 HTML 结构。您将需要提供一些失败的案例来帮助我确定是否有可以应用的一致选择器方法。
Option Explicit
Public Sub GetResultsCount()
Dim sResponse As String, html As HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.google.com/search?q=mitsubishi", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set html = New HTMLDocument
With html
.body.innerHTML = sResponse
Debug.Print .querySelector("#resultStats").innerText
End With

End Sub

关于html - VBA - Google 新闻搜索结果的数量,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/53106142/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com