java - How to get a specific child element from HTML data using Jsoup

Reposted. Author: 行者123. Last updated: 2023-12-01 17:22:30

I am trying to extract all the prices from an HTML file using Jsoup. The simplified HTML structure looks like this:

//some html

<div class="price-point-wrap use-roundtrippricing">
<div class="price-point-wrap-top use-roundtrippricing">


<div class="pp-from-total use-roundtrippricing">Roundtrip</div>
</div>
<div class="price-point price-point-revised use-roundtrippricing">
$509
</div>

<div class="fare-select-button-div">
<input type="button" aria-describedby="sr_product_ECONOMY_123-745|1975-UA" value="Select" class="fare-select-button">
<span class="visuallyhidden">fare for Economy (lowest)</span>
</div>

</div>

//some html

<div class="price-point-wrap use-roundtrippricing">
<div class="price-point-wrap-top use-roundtrippricing">


<div class="pp-from-total use-roundtrippricing">Roundtrip</div>
</div>
<div class="price-point price-point-revised use-roundtrippricing">
$1,046
</div>

<div class="fare-select-button-div">
<input type="button" aria-describedby="sr_product_MIN-BUSINESS-OR-FIRST_123-745|1975-UA" value="Select" class="fare-select-button">
<span class="visuallyhidden">fare for First (2-cabin, lowest)</span>
</div>

<div class="pp-remaining-seats">​5 tickets left at this price​</div>
</div>

//some html

This is what I have tried so far:

File input = new File("Flights.html");
Document document = Jsoup.parse(input, "UTF-8", "");
Elements prices = document.getElementsByClass("price-point");
for (Element e : prices) {
    System.out.println(e.toString());
}

This gives me the following output:

<div class="price-point price-point-revised use-roundtrippricing">
$509
</div>
<div class="price-point price-point-revised use-roundtrippricing">
$1,046
</div>
.....

But what I want is just the prices, like this:

509
1046

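I tried using a regular expression to keep only the digits when printing, e.toString().replaceAll("\\D+",""), which seems to work, but it is not how I would like to do it. How can I get only the numbers using Jsoup?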

Best answer

Thanks to @Eritrean's comment: I needed to use e.text() instead of e.toString(), which gives me

$509 
$1,046

I still need a regular expression such as e.text().replaceAll("[$,]", "") to strip the dollar sign and the thousands separator.
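
The clean-up step can be tried on its own, without Jsoup. Below is a minimal sketch in plain Java that takes strings like the ones e.text() printed above and turns them into integers. The class name PriceText and the method parsePrice are hypothetical names chosen for illustration, and also stripping whitespace is an assumption added so that stray spaces around the price do not break parsing.

```java
import java.util.List;

class PriceText {
    // Hypothetical helper: turns price text such as "$1,046" into the int 1046.
    static int parsePrice(String text) {
        // Remove the dollar sign, thousands separators and any whitespace.
        String digits = text.replaceAll("[$,\\s]", "");
        return Integer.parseInt(digits);
    }

    public static void main(String[] args) {
        // Stand-ins for the strings that e.text() returned for each price element.
        List<String> texts = List.of("$509", "$1,046");
        for (String t : texts) {
            System.out.println(parsePrice(t));
        }
    }
}
```

Running main prints 509 and 1046, one per line. In the original loop the same helper would be called as parsePrice(e.text()). Note that Integer.parseInt throws a NumberFormatException if anything other than digits is left after the replacement, for example a decimal price such as "$509.50".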

The original question, "java - How to get a specific child element from HTML data using Jsoup", is on Stack Overflow: https://stackoverflow.com/questions/61268080/
