
html - Downloading an Excel file from a website where the file name changes

Reposted. Author: 行者123. Updated: 2023-12-04 19:50:28

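I am trying to download a file from the PPAC website. The code works, but the file name changes every month and is not always predictable: the name contains a date in dd-mm-yyyy form (for example PT_installed_24-04-2020.xls), and that date can be any day of the month.

How can I download the file when only part of its name is known in advance?

One avenue I hope will be useful is the HTML markup, where the file name appears as follows: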

<H5>Installed Refinery Capacity</H5>
<ul>
<li>
<a href HERE LIES MY TARGET FILE ... </a>

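So the bullet point under the "Installed Refinery Capacity" heading contains my file name, which stays the same apart from the date.
Another option would be to loop through multiple dates until I find the right one.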
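
The date-looping option could be sketched roughly as follows. This is only an illustration: CandidateUrls is a hypothetical helper name, and it assumes the file always follows the PT_installed_dd-mm-yyyy.xls pattern seen in the URL of this question, which the site does not guarantee.

```vba
Option Explicit

' Hypothetical helper: builds one candidate URL per day of the given month,
' newest day first, assuming the PT_installed_dd-mm-yyyy.xls naming pattern.
Public Function CandidateUrls(ByVal yearNum As Long, ByVal monthNum As Long) As Collection
    Const BASE_URL As String = _
        "https://www.ppac.gov.in/WriteReadData/userfiles/file/PT_installed_"
    Dim result As New Collection
    Dim firstDay As Date, lastDay As Date
    Dim d As Date

    firstDay = DateSerial(yearNum, monthNum, 1)
    ' Day 0 of the following month is the last day of this month.
    lastDay = DateSerial(yearNum, monthNum + 1, 0)

    For d = lastDay To firstDay Step -1
        result.Add BASE_URL & Format$(d, "dd-mm-yyyy") & ".xls"
    Next d

    Set CandidateUrls = result
End Function
```

Each candidate would then be requested in turn until one returns status 200. That can cost up to 31 requests per month, so it is probably best kept as a fallback.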

Sub DownloadFile()

    Dim myURL As String

    'myURL = "https://www.ppac.gov.in/WriteReadData/userfiles/file/PT_installed_24-04-2020.xls?your_query_parameters"
    myURL = "https://www.ppac.gov.in/WriteReadData/userfiles/file/PT_installed_24-04-2020.xls"
    'myURL = Cells(10, 3)

    Dim WinHttpReq As Object
    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    WinHttpReq.Open "GET", myURL, False, "username", "password"
    WinHttpReq.send

    If WinHttpReq.Status = 200 Then
        Dim oStream As Object
        Set oStream = CreateObject("ADODB.Stream")
        oStream.Open
        oStream.Type = 1 ' 1 = binary
        oStream.Write WinHttpReq.responseBody
        oStream.SaveToFile "C:\tmp\file.csv", 2 ' 1 = no overwrite, 2 = overwrite
        oStream.Close
    End If

End Sub

Best answer

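Make an initial XMLHTTP request to the page that lists the file:
https://www.ppac.gov.in/content/146_1_ProductionPetroleum.aspx

Read the .responseText into html.body.innerHTML, where html is an instantiated MSHTML.HTMLDocument variable.

Then match a substring of the link with a CSS attribute = value selector, using the contains operator (*), to get the correct href: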

Dim link As String

link = html.querySelector("[href*='PT_installed']").href

Then continue with your existing code, using that link as the download URL.
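
Before reusing the link, it may be worth checking what form the href comes back in. As an assumption not confirmed by the source, an MSHTML document filled through innerHTML may return relative links prefixed with "about:", or the page may already use absolute links. A minimal sketch of a hypothetical helper, ToAbsoluteUrl, that handles these cases:

```vba
Option Explicit

' Hypothetical helper: turns an href read from MSHTML into an absolute URL.
' Assumption: relative links may come back prefixed with "about:".
Public Function ToAbsoluteUrl(ByVal href As String, ByVal host As String) As String
    Dim cleaned As String
    cleaned = href

    ' Drop the "about:" prefix if one is present.
    If LCase$(Left$(cleaned, 6)) = "about:" Then cleaned = Mid$(cleaned, 7)

    If LCase$(Left$(cleaned, 4)) = "http" Then
        ToAbsoluteUrl = cleaned                 ' already absolute
    ElseIf Left$(cleaned, 1) = "/" Then
        ToAbsoluteUrl = host & cleaned          ' root-relative path
    Else
        ' Simplification: treats a bare relative path as relative to the site root.
        ToAbsoluteUrl = host & "/" & cleaned
    End If
End Function
```

It could be called as ToAbsoluteUrl(link, "https://www.ppac.gov.in"), so the host is prepended only when the link actually needs it.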


The same steps can also be wrapped in a helper function, for example:

Public Sub DownloadFile()

    Dim myURL As String

    ' Look up the current link on the listing page instead of hard-coding the dated file name.
    myURL = "https://www.ppac.gov.in" & GetLink("https://www.ppac.gov.in/content/146_1_ProductionPetroleum.aspx")

    Dim WinHttpReq As Object
    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    WinHttpReq.Open "GET", myURL, False, "username", "password"
    WinHttpReq.send

    If WinHttpReq.Status = 200 Then
        Dim oStream As Object
        Set oStream = CreateObject("ADODB.Stream")
        oStream.Open
        oStream.Type = 1 ' 1 = binary
        oStream.Write WinHttpReq.responseBody
        oStream.SaveToFile "C:\tmp\file.csv", 2 ' 1 = no overwrite, 2 = overwrite
        oStream.Close
    End If

End Sub

Public Function GetLink(ByVal url As String) As String
    ' Requires VBE > Tools > References > Microsoft HTML Object Library
    Dim xhr As Object, html As MSHTML.HTMLDocument
    Set xhr = CreateObject("MSXML2.XMLHTTP")
    Set html = New MSHTML.HTMLDocument

    ' Fetch the listing page and load its markup into the HTML document.
    With xhr
        .Open "GET", url, False
        .send
        html.body.innerHTML = .responseText
    End With

    ' First element whose href contains "PT_installed".
    GetLink = html.querySelector("[href*='PT_installed']").href
End Function
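
Since the target is an .xls workbook whose name changes each month, the saved copy could keep the name taken from the link rather than a fixed name. A minimal sketch, where FileNameFromUrl is a hypothetical helper and not part of the original answer:

```vba
Option Explicit

' Hypothetical helper: returns the last path segment of a URL,
' ignoring any query string, e.g. "PT_installed_24-04-2020.xls".
Public Function FileNameFromUrl(ByVal url As String) As String
    Dim cut As Long

    ' Strip a trailing "?query" part if there is one.
    cut = InStr(1, url, "?")
    If cut > 0 Then url = Left$(url, cut - 1)

    ' Everything after the final "/" is the file name.
    FileNameFromUrl = Mid$(url, InStrRev(url, "/") + 1)
End Function
```

The result could then be passed to SaveToFile, for instance as "C:\tmp\" & FileNameFromUrl(myURL), so each monthly download keeps its dated name and its .xls extension.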

Regarding "html - Downloading an Excel file from a website where the file name changes", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/61570598/
