html - Regex to get everything between tags across multiple lines in VB.Net

Reposted. Author: 行者123. Updated: 2023-11-28 01:33:04

This is a page with a lot of content, but it contains 50 blocks like the one I have posted below.

HTML block

<li>
<dl>
<dd>

<a href="/wow/en/item/113987" class="color-q4" data-item="pl=100&amp;cc=5&amp;bl=566">




<span class="icon-frame frame-18 " style='background-image: url("http://media.blizzard.com/wow/icons/18/inv_misc_trinket6oih_lanternb1.jpg");'>
</span>
</a>Obtained <a href="/wow/en/item/113987" class="color-q4" data-item="pl=100&amp;cc=5&amp;bl=566">Battering Talisman</a>.


</dd>
<dt>22 hours ago</dt>
</dl>
</li>

The code I am using now only matches this one line:

Obtained <a href="/wow/en/item/113987" class="color-q4" data-item="pl=100&amp;cc=5&amp;bl=566">Battering Talisman</a>.

How can I make my MatchCollection return the complete HTML block as a single match?

Dim explorer As New WowExplorer(WowDotNetAPI.Region.EU, Locale.en_GB, "apikey")
Dim Request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://eu.battle.net/wow/en/character/" & Me.Realm & "/" & Me.Name & "/feed")
Dim Response As System.Net.HttpWebResponse = Request.GetResponse
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(Response.GetResponseStream())
Dim Sourecode As String = sr.ReadToEnd

Dim Item_ As New System.Text.RegularExpressions.Regex( _
    "Obtained <a href=""/wow/en/item/.*"" class=""color-q4"".*")

Dim matche_name As MatchCollection = Item_.Matches(Sourecode)
For Each Match As Match In matche_name
    Dim ItemID As String
    Dim ID_Match As String = Match.Value.Split("/").GetValue(4)
    ItemID = ID_Match.Split("""").GetValue(0)
    Me.Items.Add(explorer.GetItem(ItemID, ItemSource))
Next

Best answer

Here is sample code showing how to get these strings using XDocument with XPath, and also using a regex (I added a second <li> to simulate the HTML you probably have):

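' Requires at the top of the file:
'   Imports System.Linq
'   Imports System.Text.RegularExpressions
'   Imports System.Xml.Linq
'   Imports System.Xml.XPath
Dim dds As New List(Of String)
Dim dts As New List(Of String)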
Dim str As String = "<li> <dl> <dd> <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&amp;cc=5&amp;bl=566""> <span class=""icon-frame frame-18 "" style='background-image: url(""http://media.blizzard.com/wow/icons/18/inv_misc_trinket6oih_lanternb1.jpg"");'> </span> </a>Obtained <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&amp;cc=5&amp;bl=566"">Battering Talisman</a>.</dd> <dt>22 hours ago</dt> </dl> </li>"
str += "<li> <dl> <dd> <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&amp;cc=5&amp;bl=566""> <span class=""icon-frame frame-18 "" style='background-image: url(""http://media.blizzard.com/wow/icons/18/inv_misc_trinket6oih_lanternb1.jpg"");'> </span> </a>Obtained <a href=""/wow/en/item/113987"" class=""color-q4"" data-item=""pl=100&amp;cc=5&amp;bl=566"">New Talisman</a>.</dd> <dt>10 hours ago</dt> </dl> </li>"
' XPath approach: wrap the fragment in a root element so it parses as XML
Dim xDoc As XDocument = XDocument.Parse("<?xml version='1.0'?><root>" & str & "</root>")
dds = xDoc.XPathSelectElements("//dd").Select(Function(m) m.Value).ToList()
dts = xDoc.XPathSelectElements("//dt").Select(Function(m) m.Value).ToList()

' Regex approach
dds = New List(Of String)
dts = New List(Of String)
' Group 1: text before the item link, group 2: link text, group 3: text after the link
Dim rx As Regex = New Regex("(?s)</a>([^<]*?)<a\s[^>]*?>([^<]*?)</a>([^<\r\n]*)")
Dim matches As IEnumerable(Of Match) = rx.Matches(str).Cast(Of Match)()
dds = (From match In matches
       Select match.Groups(1).Value & match.Groups(2).Value & match.Groups(3).Value).ToList()
' Capture the trimmed text of each <dt> element
Dim rxDt As Regex = New Regex("(?s)<dt>\s*([^<]*?)\s*</dt>")
Dim matches_dts As IEnumerable(Of Match) = rxDt.Matches(str).Cast(Of Match)()
dts = (From match In matches_dts
       Select match.Groups(1).Value).ToList()

Result: (shown as a screenshot in the original answer; the image is not available here)
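
The question asks for each whole HTML block as a single match. The original pattern stops at the end of the line because `.` does not match a newline by default. The following is a minimal sketch of one way to get one match per block, not a definitive implementation. It assumes each block is a plain `<li>` with no attributes and no nested `<li>`. The names `FeedBlocks`, `ExtractBlocks` and `ExtractItemId` are hypothetical, and the sample HTML is a shortened version of the block from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FeedBlocks
    ' Hypothetical helper: returns each <li>...</li> block as one string.
    ' RegexOptions.Singleline lets "." match newlines, and the lazy ".*?"
    ' stops at the first closing </li> rather than the last one.
    Function ExtractBlocks(html As String) As List(Of String)
        Dim blocks As New List(Of String)
        For Each m As Match In Regex.Matches(html, "<li>.*?</li>", RegexOptions.Singleline)
            blocks.Add(m.Value)
        Next
        Return blocks
    End Function

    ' Hypothetical helper: reads the numeric item ID with a named group
    ' instead of splitting the matched text on "/" and quotes.
    Function ExtractItemId(block As String) As String
        Dim m As Match = Regex.Match(block, "Obtained\s*<a\s+href=""/wow/en/item/(?<id>\d+)""")
        Return If(m.Success, m.Groups("id").Value, Nothing)
    End Function

    Sub Main()
        ' Shortened sample data: two blocks spread over several lines
        Dim html As String =
            "<li>" & vbLf &
            "<dl><dd>Obtained <a href=""/wow/en/item/113987"">Battering Talisman</a>.</dd>" & vbLf &
            "<dt>22 hours ago</dt></dl>" & vbLf &
            "</li>" & vbLf &
            "<li>" & vbLf &
            "<dl><dd>Obtained <a href=""/wow/en/item/113987"">New Talisman</a>.</dd>" & vbLf &
            "<dt>10 hours ago</dt></dl>" & vbLf &
            "</li>"

        For Each block As String In ExtractBlocks(html)
            Console.WriteLine(ExtractItemId(block))
        Next
    End Sub
End Module
```

With this sample input, `ExtractBlocks` yields two strings, one per `<li>`, and the loop prints the item ID from each. If the real page puts attributes on `<li>` or nests lists, a regex like this will likely mis-split the blocks, and the XPath approach is the safer choice.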

Regarding "html - Regex to get everything between tags across multiple lines in VB.Net", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/29684931/
