
html - Extract a table from a web page using VBA

Reposted. Author: 行者123. Updated: 2023-12-04 22:31:20

I want to use VBA to extract a table from an HTML page into Excel.

I have tried the code below many times, changing parts of it, but I keep getting errors.

Sub GrabTable()

'dimension (set aside memory for) our variables
Dim objIE As InternetExplorer
Dim ele As Object
Dim y As Integer

'start a new browser instance
Set objIE = New InternetExplorer
'make browser visible
objIE.Visible = False

'navigate to page with needed data
objIE.navigate "http://www.bursamalaysia.com/market/listed-companies/company-announcements/5923061"
'wait for page to load
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

'we will output data to excel, starting on row 1
y = 1

'look at all the 'tr' elements in the 'table' with id 'InputTable2',
'and evaluate each, one at a time, using 'ele' variable
For Each ele In objIE.document.getElementByClassName("InputTable2").getElementsByTagName("tr")
'show the text content of 'td' element being looked at
Debug.Print ele.textContent
'each 'tr' (table row) element contains 2 children ('td') elements
'put text of 1st 'td' in col A
Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent
'put text of 2nd 'td' in col B
Sheets("Sheet1").Range("B" & y).Value = ele.Children(1).textContent

y = y + 1
'repeat until last ele has been evaluated
Next


End Sub

Best answer

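I show you two methods:

  • Using IE: the data is inside an iframe, which has to be negotiated.
  • Using an XMLHTTP request: much faster, and no browser is opened. It uses the first part of the iframe document URL, which is what the iframe navigates to.

    In both cases I access the table containing the company name and then the disclosure information table. For the main disclosure table, I copy its outerHTML to the clipboard and paste it into Excel to avoid looping over all the rows and columns. You could instead simply loop over the tr (table row) and td (table cell) elements within it.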

    IE:
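    Option Explicit
    ' Requires a reference to Microsoft Internet Controls (Tools > References).
    Public Sub GetInfo()
        Dim IE As New InternetExplorer, clipboard As Object
        With IE
            .Visible = True
            .navigate "http://www.bursamalaysia.com/market/listed-companies/company-announcements/5923061"

            ' wait for the page to finish loading
            While .Busy Or .readyState < 4: DoEvents: Wend

            ' late-bound MSForms DataObject, used for clipboard access
            Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
            ' the data lives inside the iframe, so work on the iframe's document
            With .document.getElementById("bm_ann_detail_iframe").contentDocument
                ThisWorkbook.Worksheets("Sheet1").Cells(1, 1) = .getElementsByClassName("company_name")(0).innerText
                ' copy the disclosure table's HTML to the clipboard
                clipboard.SetText .getElementsByTagName("table")(1).outerHTML
                clipboard.PutInClipboard
            End With

            ' pasting the HTML lets Excel lay out the rows and columns itself
            ThisWorkbook.Worksheets("Sheet1").Cells(2, 1).PasteSpecial
            .Quit
        End With
    End Sub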

    XMLHTTP:

    You can extract a different URL from the front part of the iframe URL and use it, as shown below.

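    Here is the part of the original HTML that shows the iframe and the associated new URL:

    [Image not reproduced: screenshot of the page HTML showing the iframe and its URL]
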
    Option Explicit
    ' Requires a reference to Microsoft HTML Object Library (Tools > References).
    Public Sub GetTable()
        Dim sResponse As String, html As HTMLDocument, clipboard As Object
        Set html = New HTMLDocument
        With CreateObject("MSXML2.XMLHTTP")
            ' synchronous GET of the URL the iframe navigates to
            .Open "GET", "http://disclosure.bursamalaysia.com/FileAccess/viewHtml?e=2891609", False
            ' request header intended to avoid being served a cached response
            .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
            .send
            ' convert the response bytes to a VBA string
            sResponse = StrConv(.responseBody, vbUnicode)
        End With

        ' parse the response with MSHTML
        html.body.innerHTML = sResponse

        With html
            ThisWorkbook.Worksheets("Sheet1").Cells(1, 1) = .querySelector(".company_name").innerText
            ' late-bound MSForms DataObject, used for clipboard access
            Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
            clipboard.SetText .querySelector(".InputTable2").outerHTML
            clipboard.PutInClipboard
        End With

        ThisWorkbook.Worksheets("Sheet1").Cells(2, 1).PasteSpecial
    End Sub
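
    The answer notes that you can loop over the tr and td elements instead of pasting from the clipboard. A minimal sketch of that variant follows, reusing the same XMLHTTP request. The procedure name GetTableByLoop and the row/column counters are hypothetical, and the sketch assumes the page still has a table with class InputTable2 and that a reference to Microsoft HTML Object Library is set. It has not been run against the live page.

    ```vb
    Option Explicit

    ' Hypothetical variant of GetTable: writes cell text by looping over
    ' tr/td elements instead of pasting the table's outerHTML.
    Public Sub GetTableByLoop()
        Dim sResponse As String, html As HTMLDocument
        Dim tbl As Object, tr As Object, td As Object
        Dim r As Long, c As Long

        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", "http://disclosure.bursamalaysia.com/FileAccess/viewHtml?e=2891609", False
            .send
            sResponse = StrConv(.responseBody, vbUnicode)
        End With

        Set html = New HTMLDocument
        html.body.innerHTML = sResponse

        ' assumption: the disclosure table carries the class "InputTable2"
        Set tbl = html.querySelector(".InputTable2")
        If tbl Is Nothing Then Exit Sub   ' table not found; nothing to write

        r = 2   ' row 1 is left free for the company name
        For Each tr In tbl.getElementsByTagName("tr")
            c = 1
            ' only td cells are written; th header cells are skipped
            For Each td In tr.getElementsByTagName("td")
                ThisWorkbook.Worksheets("Sheet1").Cells(r, c).Value = td.innerText
                c = c + 1
            Next td
            r = r + 1
        Next tr
    End Sub
    ```

    Looping gives you plain text in each cell and control over where it lands, at the cost of losing the formatting and merged cells that pasting the HTML preserves. If the table contains nested tables, getElementsByTagName also returns their rows and cells, so the output layout may need adjusting.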

    Regarding "html - Extract a table from a web page using VBA", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52528648/
