gpt4 book ai didi

html - 从网络导入数据时,如何获取带有链接的数据?

转载 作者:行者123 更新时间:2023-12-04 22:30:54 26 4
gpt4 key购买 nike

Precious @QHarr 编写的这段代码(与其他代码一样)运行良好。但是,在导入数据时,我想检索保存在连接中的数据。代码的输出和我想要接收的数据显示在附图中。我可以解决什么样的代码? (谷歌翻译)

    Public Sub DYarislar()
Dim asays(), ws As Worksheet, asay As Long, html As HTMLDocument
Dim http As clsHTTP, url As String, headers(), numberOfRequests As Long

headers = Array("Asay", "Tarih", "Sehir", "K.Cinsi", "Gr", "Msf/Pist", "Derece", "S", "Jokey", "Kilo", "G.Ç", "Hnd", "Gny", "Taki")
Set http = New clsHTTP
Set ws = ThisWorkbook.Worksheets("X")


Set html = New HTMLDocument
asays = Application.Transpose(Sheets("Y").Range("A2:A" & Sheets("Y").Columns("A:A").Find(What:="boş").Row - 1).Value)

Const numTableRows As Long = 250
Const numTableColumns As Long = 14
Const BASE_URL As String = "https://yenibeygir.com/at/"

numberOfRequests = UBound(asays)

Dim results(), headerRow As Boolean, tRows As Object, tRow As Object, iRow As Long
Dim tCells As Object, tCell As Object, r As Long, c As Long, hTable As HTMLTable
ReDim results(1 To numTableRows * numberOfRequests, 1 To numTableColumns)

Application.ScreenUpdating = False

For asay = 1 To numberOfRequests
headerRow = True
url = BASE_URL & asays(asay)
html.body.innerHTML = http.GetString(url)

Set hTable = html.querySelector(".at_Yarislar")

Set tRows = hTable.getElementsByTagName("tr")

Const numberOfRaces As Long = 22
Dim counter As Long
counter = 1
For Each tRow In tRows
If Not headerRow Then
counter = counter + 1
If counter > numberOfRaces Then Exit For
c = 2: r = r + 1
results(r, 1) = asays(asay)
Set tCells = tRow.getElementsByTagName("td")
For Each tCell In tCells
results(r, c) = tCell.innerText
c = c + 1
Next
End If
headerRow = False
Next
Next

With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
Application.ScreenUpdating = True

End Sub

Picture

最佳答案

您只需要进行一些小的更改。您使用相同的类,clsHTTP ,和以前一样,然后使用下面的模块 1 代码。

备注:

在每个源页表行中,jockey 列包含一个 a标记链接元素

enter image description here

您可以使用以下方式访问它:

tRow.getElementsByTagName("a")(1).href

由于链接是相对的,您需要进行文本替换以添加到 URL 的基本部分,即
Replace$(tRow.getElementsByTagName("a")(1).href, "about:", BASE_URL2)

id 是 href 的一部分并且可以使用 Split 提取:
Split(tRow.getElementsByTagName("a")(1).href, "/")(2)

要在结果中允许这些附加元素,您需要增加输出列数:
Const numTableColumns As Long = 16

并调整您的表格行循环以填充其他列:
results(r, 2) = Split(tRow.getElementsByTagName("a")(1).href, "/")(2) 
results(r, 3) = Replace$(tRow.getElementsByTagName("a")(1).href, "about:", BASE_URL2)

此外,在循环中进行调整以确保从第 4 列开始填充其他列(作为 2 个额外列):
c = 4

最后,调整标题以包含 2 个新列:
headers = Array("Asay", "JokeyId", "JokeyLink", "Tarih", "Sehir", "K.Cinsi", "Gr", "Msf/Pist", "Derece", "S", "Jokey", "Kilo", "G.Ç", "Hnd", "Gny", "Taki")

VBA:

模块一:
Option Explicit    
Public Sub DYarislar()
Dim asays(), ws As Worksheet, asay As Long, html As HTMLDocument
Dim http As clsHTTP, url As String, headers(), numberOfRequests As Long

headers = Array("Asay", "JokeyId", "JokeyLink", "Tarih", "Sehir", "K.Cinsi", "Gr", "Msf/Pist", "Derece", "S", "Jokey", "Kilo", "G.Ç", "Hnd", "Gny", "Taki")
Set http = New clsHTTP
Set ws = ThisWorkbook.Worksheets("X")
Set html = New HTMLDocument
asays = Application.Transpose(Sheets("Y").Range("A2:A" & Sheets("Y").Columns("A:A").Find(What:="boş").Row - 1).Value)

Const numTableRows As Long = 250
Const numTableColumns As Long = 16
Const BASE_URL As String = "https://yenibeygir.com/at/"
Const BASE_URL2 As String = "https://yenibeygir.com"
numberOfRequests = UBound(asays)

Dim results(), headerRow As Boolean, tRows As Object, tRow As Object, iRow As Long
Dim tCells As Object, tCell As Object, r As Long, c As Long, hTable As HTMLTable
ReDim results(1 To numTableRows * numberOfRequests, 1 To numTableColumns)

Application.ScreenUpdating = False

For asay = 1 To numberOfRequests
headerRow = True
url = BASE_URL & asays(asay)
html.body.innerHTML = http.GetString(url)

Set hTable = html.querySelector(".at_Yarislar")

Set tRows = hTable.getElementsByTagName("tr")

For Each tRow In tRows
If Not headerRow Then
c = 4: r = r + 1
results(r, 1) = asays(asay)
On Error Resume Next
results(r, 2) = Split(tRow.getElementsByTagName("a")(1).href, "/")(2)
results(r, 3) = Replace$(tRow.getElementsByTagName("a")(1).href, "about:", BASE_URL2)
On Error GoTo 0
Set tCells = tRow.getElementsByTagName("td")
For Each tCell In tCells
results(r, c) = tCell.innerText
c = c + 1
Next
End If
headerRow = False
Next
Next

With ws
.Cells(1, 3).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 3).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
Application.ScreenUpdating = True
End Sub

示例结果:

enter image description here

关于html - 从网络导入数据时,如何获取带有链接的数据?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/52914664/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com