
html - Scraping data from a website with Excel VBA using CSS selectors

Reposted. Author: 行者123. Updated: 2023-12-02 21:09:34

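I am trying to scrape specific data from a website using CSS selectors. I got it working with QHar's help, but the requirements have now changed. Here is my current code: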

Code

Public Sub CompanyData2()

    Dim html As HTMLDocument, ws As Worksheet, re As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\s{2,}"
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.bizi.si/iskanje?q=", False
        .send
        html.body.innerHTML = .responseText
    End With

    ws.Range("A4").Value = re.Replace(Join$(Array(html.querySelector("td.item a").innerText), ", "), Chr$(32))
    ws.Range("A5").Value = re.Replace(Join$(Array(html.querySelector("td.item + td.item").innerText), ", "), Chr$(32))
    ws.Range("B6").Value = re.Replace(Join$(Array(html.querySelector("td.item + td.item + td.item + td.item").innerText), ", "), Chr$(32))

End Sub

The result is as follows:

[image: resulting worksheet]

Website

[image: the website]

I want to extract the company name into cell A3 of Sheet1, as shown here:

[image: desired output]

Thanks.

Best answer

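You need REPROMAT in A1. After issuing the initial search query, you then have to visit the actual company page to get the company name as you show it. If you use the company URL directly, you can skip the first request and use only the code from the second request.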

Public Sub CompanyData()
    ' Requires a reference to "Microsoft HTML Object Library" for HTMLDocument
    Dim html As HTMLDocument, ws As Worksheet, nodes As Object

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        ' First request: search for the term held in A1
        .Open "GET", "https://www.bizi.si/iskanje?q=" & Application.EncodeURL(ws.Range("A1").Value), False
        .send
        html.body.innerHTML = .responseText

        Set nodes = html.querySelectorAll("td.item")

        With ws
            .Range("A4").Value = nodes.Item(0).FirstChild.innerText
            .Range("A5").Value = nodes.Item(1).innerText
            .Range("A6").Value = "DŠ: " & nodes.Item(3).innerText
        End With

        ' Second request: follow the company link from the search results
        .Open "GET", html.querySelector("[id$=linkCompany]").href, False
        .send
        html.body.innerHTML = .responseText
        ws.Range("A3").Value = html.querySelector("#ctl00_ctl00_cphMain_cphMainCol_CompanySPLPreview1_labTitlePRS").innerText
    End With
End Sub
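
As the answer notes, when the company page URL is already known the search request can be skipped. A minimal sketch of that shortcut might look like the following. The function name GetCompanyName and its companyUrl parameter are hypothetical and not part of the original answer; the element id is the one used in the answer's code and is assumed to still match the page. The status and Nothing checks are an added precaution, not something the answer included.

```vba
Option Explicit

' Sketch only: read the company name straight from a company page.
' Requires a reference to "Microsoft HTML Object Library".
' GetCompanyName and companyUrl are hypothetical names used for illustration.
Public Function GetCompanyName(ByVal companyUrl As String) As String
    Dim html As HTMLDocument, node As Object

    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", companyUrl, False
        .send
        ' Return an empty string if the request did not succeed
        If .Status <> 200 Then Exit Function
        html.body.innerHTML = .responseText
    End With

    ' Same element id as in the answer above (assumed unchanged on the site)
    Set node = html.querySelector("#ctl00_ctl00_cphMain_cphMainCol_CompanySPLPreview1_labTitlePRS")

    ' querySelector yields Nothing when no element matches
    If Not node Is Nothing Then GetCompanyName = node.innerText
End Function
```

It could then be called as ws.Range("A3").Value = GetCompanyName(url), where url holds the company page address. An empty result means either the request failed or the element was not found.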

Regarding "html - Scraping data from a website with Excel VBA using CSS selectors", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58899500/
