java - Split an element's text on a tag

Reposted. Author: 行者123. Updated: 2023-11-30 08:06:26

Suppose I have an element that looks like this:

<li> this is before <span class="between"> this is between </span> this is after </li>

How can I use JSoup to get the array {"this is before", "this is after"}?

Note: the text may contain several spans, but only one of them has the class between. For example,

<li> 
this
<span class="other"> is </span>
before
<span class="between"> this is between </span>
this is
<span class="other"> after </span>
</li>

should also produce {"this is before", "this is after"}.

Best answer

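You can iterate over the child nodes of the li, collecting text in a buffer and starting a new entry whenever the span with the between class is reached: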

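import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class SplitOnSpan {
    public static void main(String[] args) {
        final String html = "<li> \n"
                + "this \n"
                + "<span class=\"other\"> is </span> \n"
                + "before \n"
                + "<span class=\"between\"> this is between </span> \n"
                + "this is \n"
                + "<span class=\"other\"> after </span> \n"
                + "</li>";

        Document doc = Jsoup.parse(html);
        Element li = doc.select("li").first();
        List<String> text = new ArrayList<>();   // finished pieces
        StringBuilder sb = new StringBuilder();  // piece currently being built

        for (Node node : li.childNodes()) {      // iterate over the child nodes
            if (node instanceof TextNode) {
                // Plain text: use text() rather than toString(), which returns escaped HTML
                sb.append(((TextNode) node).text());
            } else if (node instanceof Element) {
                final Element element = (Element) node;

                if (element.tagName().equals("span") && element.hasClass("between")) {
                    // Span with the 'between' class: close the current piece, start a new one
                    text.add(sb.toString().trim());
                    sb = new StringBuilder();
                } else {
                    // Every other element: keep its own text (nested descendants are not included)
                    sb.append(element.ownText());
                }
            }
        }

        text.add(sb.toString().trim());          // add the piece after the separator

        System.out.println(text);
    }
}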

Output:

[this is before, this is after]
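
The same accumulate-and-flush idea does not depend on JSoup itself. As a rough sketch only, the variant below swaps JSoup for the JDK's built-in DOM parser, so it runs without extra dependencies. It assumes the fragment is well-formed XML (real-world HTML often is not, which is where JSoup is the better fit) and that the separator's class attribute is exactly the given name. The names SplitOnTag, splitOnClass and normalise are hypothetical and do not come from the original answer or from any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class SplitOnTag {

    // Hypothetical helper: splits the text of the root element on the direct
    // child element whose class attribute equals splitClass (exact match).
    static List<String> splitOnClass(String xml, String splitClass) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();

        List<String> parts = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        NodeList children = root.getChildNodes();

        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            boolean isSeparator = node instanceof Element
                    && splitClass.equals(((Element) node).getAttribute("class"));

            if (isSeparator) {
                // Separator reached: close the current piece and start a new one
                parts.add(normalise(sb.toString()));
                sb.setLength(0);
            } else if (node instanceof Text || node instanceof Element) {
                // Text node or any other element: append all of its text,
                // including text of nested descendants; comments are skipped
                sb.append(node.getTextContent());
            }
        }
        parts.add(normalise(sb.toString()));
        return parts;
    }

    // Trims the ends and collapses runs of whitespace into single spaces
    static String normalise(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<li> \nthis \n<span class=\"other\"> is </span> \nbefore \n"
                + "<span class=\"between\"> this is between </span> \nthis is \n"
                + "<span class=\"other\"> after </span> \n</li>";
        System.out.println(splitOnClass(xml, "between")); // prints [this is before, this is after]
    }
}
```

Unlike ownText() in the JSoup answer, getTextContent() also includes the text of nested descendants, so the two versions can differ when the other spans contain further markup.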

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/31032187/
