gpt4 book ai didi

java - SAXParseException;系统ID : cumulative size of entities exceeds bound

转载 作者:行者123 更新时间:2023-11-30 06:59:13 27 4
gpt4 key购买 nike

早上好,

我必须用Java解析一个巨大的xml文件(2GB)。它有很多标签,但我只需要写两个标签的内容<title><subtext>每次都在一个公共(public)文件中,所以我使用 SaxParse

到目前为止,我已经成功在输出文件中写入了 1M95 文本,到那时就会发生此异常:

org.xml.sax.SAXParseException; systemId: filePath; lineNumber: x; columnNumber: y; JAXP00010004 : La taille cumulée des entités est "50 000 001" et dépasse la limite de "50 000 000" définie par "FEATURE_SECURE_PROCESSING".
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1465)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.checkEntityLimit(XMLScanner.java:1544)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.handleCharacter(XMLDocumentFragmentScannerImpl.java:1940)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1866)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3058)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:504)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:328)
at Parsing.main(Class.java:38)

异常的翻译如下:

The cumulative size of the entities is "50 000 001" which exceeds the boundary of "50 000 000" defined by "FEATURE_SECURE_PROCESSING".

这是我编写的代码:

public class Parsing {

public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {

try {
File inputFile = new File(System.getProperty("user.dir") + "/input.xml");
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
UserHandler userhandler = new UserHandler();
saxParser.parse(inputFile, userhandler);
} catch (Exception e) {
e.printStackTrace();
}

}

public static void doThingOne(String text, String title) throws IOException {

// Write the text and the title on a file
}


public static void doThingTwo(String text, String title) throws IOException {
//Write the text and the title on another file

}

class UserHandler extends DefaultHandler {

boolean bText = false;
boolean bTitle = false;
StringBuffer tagTextBuffer;
StringBuffer tagTitleBuffer;
String text = null;
String title = null;

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {

if (qName.equals("title")) {
tagTitleBuffer = new StringBuffer();
bTitle = true;
} else if (qName.equalsIgnoreCase("text")) {
tagTextBuffer = new StringBuffer();
bText = true;
}
}

public void endElement(String uri, String localName, String qName) throws SAXException {
if (qName.equals("title")) {
bTitle = false;
title = tagTextBuffer.toString();

} else if (qName.equals("text")) {
text = tagTextBuffer.toString();
bText = false;
if (text!=null && title == "One") {
try {
Parsing.doThingOne(page, title);
} catch (IOException e) {
e.printStackTrace();
}
} else if (text != null) {
try {
Parsing.doThingTwo(page, title);
} catch (IOException e) {
e.printStackTrace();
}
}
}
}

public void characters(char ch[], int start, int length) throws SAXException {
if (bTitle) {
tagTitleBuffer.append(new String(ch, start, length));
} else if (bText) {
tagTextBuffer.append(new String(ch, start, length));
}
}
}

感谢您的宝贵时间。

最佳答案

关闭FEATURE_SECURE_PROCESSING没有效果(Java8)。要增加限制,请使用:

System.setProperty("jdk.xml.totalEntitySizeLimit", String.valueOf(Integer.MAX_VALUE));

在 SAXParserFactory.newInstance() 之前;

关于java - SAXParseException;系统ID : cumulative size of entities exceeds bound,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/41234013/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com