gpt4 book ai didi

java - 任何绕过 HTML 注释中的 -- 的方法

转载 作者:行者123 更新时间:2023-12-01 09:07:58 24 4
gpt4 key购买 nike

我正在使用 DocumentBuilder 将来自互联网的 xhtml(xml) 转换为 org.w3c.dom.Document,其中注释中包含“--”。有可能绕过它的方法吗?我已经设置了 setIgnoringComments 和 setValidating。

我知道 -- 不允许出现在 W3C 规范的 XML 注释中。 related posts

在约定之前对 XML 进行预处理有什么建议吗?

public static Document convertXmlStrToDocument(String xml) throws ParserConfigurationException, SAXException, IOException{
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setIgnoringComments(true);
documentBuilderFactory.setValidating(false);
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(new ByteArrayInputStream(xml.getBytes()));
return document;
}

它抛出异常:

org.xml.sax.SAXParseException; lineNumber: 914; columnNumber: 17; The string "--" is not permitted within comments.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
at com.techoffice.util.XmlUtil.convertXmlStrToDocument(XmlUtil.java:41)
at com.techoffice.util.XmlUtil.evaluateXpath(XmlUtil.java:46)
at com.techoffice.jc.horse.service.web.ResultWebService.raceDateSelect(ResultWebService.java:41)
at com.techoffice.jc.horse.service.web.ResultWebServiceTest.retrieveXml(ResultWebServiceTest.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:252)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:94)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:191)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

最佳答案

不,字符串"--" must not 出现在 XML 注释中:

For compatibility, the string " -- " (double-hyphen) must not occur within comments.

这是不可配置的。任何东西都是可以修改的,但是你会违背规则并且没有 XML 解析器支持。不推荐。

尝试HTML Tidy首先清理 HTML。还有一个Java version of HTML Tidy .

关于java - 任何绕过 HTML 注释中的 -- 的方法,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/41105604/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com