java - How to use the :empty pseudo-selector in jsoup

Reposted. Author: 行者123. Updated: 2023-12-01 09:35:29

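I want to select the div tags that contain no further div or any other tag. I tried the code below and expected the output to be "This is output", but the :empty pseudo-selector does not work.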

String htmlString =
    "<html><div><div><div><p><b>This is first line</b></p> </div><b>This is second line</b></div><div>This is output</div><div><span style=\"color:blue\">This is third line</span></div></html>";

org.jsoup.nodes.Document doc1 = Jsoup.parse(htmlString);

Elements elements1 = doc1.select("html:empty");

for (Element element : elements1) {
    System.out.println(element.toString());
}

Best answer

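Since you have posted several similar questions recently, in which your HTML structure changed and your CSS selectors broke, it may be better to avoid CSS selectors and process/filter the elements yourself: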

String htmlString = "<html><p><b>This has no div</b></p><div><div><div><p><b>This is first line</b></p></div><b>This is second line</b></div><div>This is output</div><div><span style=\"color:blue\">This is third line</span></div></html>";

Document doc = Jsoup.parse(htmlString);

Elements elements = doc.getAllElements();

// Handle every element whose first child node is a text node
outerloop:
for (Element element : elements) {
    if (element.childNodes().size() > 0 && element.childNode(0).nodeName().equals("#text")) {
        Element divContent = element;

        if (divContent.nodeName().equals("div")) {
            System.out.println("No element in div; text: " + element.text() + "\n");
        } else {
            // Climb the tree until divContent is a direct child of a div
            while (divContent.parents().size() > 0 && !divContent.parent().nodeName().equals("div")) {
                divContent = divContent.parent();
                if (divContent.parent().nodeName().equals("body")) {
                    continue outerloop; // continue, to skip element <p><b>This has no div</b></p>
                    //break; // break, if you want the element <p><b>This has no div</b></p> anyway
                }
            }

            System.out.println("element: " + divContent.toString());
            System.out.println("text: " + element.text() + "\n");
        }
    }
}

// Only for <div>text...</div>
for (Element element : elements) {
    if (element.childNodes().size() > 0 && element.childNode(0).nodeName().equals("#text") && element.nodeName().equals("div")) {
        System.out.println("text: " + element.text());
    }
}

Output:

element: <p><b>This is first line</b></p>
text: This is first line

element: <b>This is second line</b>
text: This is second line

No element in div; text: This is output

element: <span style="color:blue">This is third line</span>
text: This is third line

text: This is output
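
As a side note on why the original attempt prints nothing: in jsoup, `:empty` matches only elements with no children at all, and text nodes count as children, so `<div>This is output</div>` is not empty. If only divs without child elements are wanted, a shorter filter may be enough. The following is a minimal sketch, not part of the original answer; the class name `TextOnlyDivs` and the helper `textOnlyDivs` are hypothetical names chosen for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class TextOnlyDivs {

    // Hypothetical helper: collects the text of every <div> that has
    // no child elements but does contain some text.
    static List<String> textOnlyDivs(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element div : doc.select("div")) {
            // children() lists child elements only; text nodes are not included
            if (div.children().isEmpty() && div.hasText()) {
                result.add(div.text());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><p><b>This has no div</b></p><div><div><div><p><b>This is first line</b></p></div>"
                + "<b>This is second line</b></div><div>This is output</div>"
                + "<div><span style=\"color:blue\">This is third line</span></div></html>";
        System.out.println(textOnlyDivs(html)); // prints [This is output]
    }
}
```

Unlike the loops above, this sketch looks only at divs and ignores text that sits inside nested tags such as `<p>` or `<span>`, so it fits only the narrow case described in the question.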

Regarding "java - How to use the :empty pseudo-selector in jsoup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38989970/