
java - How to test memory limits when parsing an XML file

Reposted. Author: 行者123. Updated: 2023-12-01 17:52:03

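I am trying to run until I hit an OutOfMemoryError. My method creates a file, parses it, and if there is no error, immediately deletes the file, triggers garbage collection, generates a larger file, and repeats. However, the large files consume too much time and CPU. Is there a better way to do this? Thanks.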

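    import java.io.File;
    import java.io.IOException;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.xml.sax.SAXException;

    public static void main(String[] args) {
        for (int i = 6000000; i <= 10000000; i += 100000) {
            WriteXml(i); // asker's helper (not shown): writes limit.xml with i elements
            try {
                File fXmlFile = new File("limit.xml");
                DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
                Thread parser = new Thread() {
                    public void run() {
                        try {
                            dBuilder.parse(fXmlFile);
                        } catch (SAXException | IOException e) {
                            e.printStackTrace();
                        }
                    }
                };
                parser.start(); // the thread must be started, or nothing is parsed
                parser.join();  // wait for the parse to finish before deleting the file
                fXmlFile.delete();
            } catch (Exception e) {
                // Note: OutOfMemoryError is an Error, not an Exception, so it is not caught here.
                e.printStackTrace();
                System.out.println(i);
            }
            System.gc();
        }
    }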

Best answer

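I suspect (based on my observations) that if you try to write a very flat tree (for example, 10m elements as children of the root), you hit O(n^2) performance when appending new siblings to the end of a very long list, and you will run out of time (or patience) before you run out of memory.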
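One way to check whether build time really grows faster than linearly is to build the flat tree directly in memory and time it at doubling sizes, with no file writing or parsing involved. The following is only a minimal sketch using the JDK's built-in DOM implementation; the class name `FlatDomTiming` and its helper methods are hypothetical names introduced for illustration, and the starting size and default upper bound are arbitrary assumptions.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the original question or answer.
class FlatDomTiming {

    /** Builds <doc> with n <elem> children in memory, without touching the disk. */
    static Document buildFlat(int n) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("doc");
            doc.appendChild(root);
            for (int j = 0; j < n; j++) {
                Element elem = doc.createElement("elem");
                elem.setTextContent("The quick brown fox");
                root.appendChild(elem); // appending a new last sibling
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts the root's children by walking the sibling chain. */
    static int childCount(Document doc) {
        int count = 0;
        for (Node c = doc.getDocumentElement().getFirstChild(); c != null; c = c.getNextSibling()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Upper bound is an assumption; pass a larger value as the first argument.
        long max = args.length > 0 ? Long.parseLong(args[0]) : 1L << 18;
        for (long n = 1024; n <= max && n <= Integer.MAX_VALUE; n *= 2) {
            long start = System.nanoTime();
            buildFlat((int) n);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(n + " elements: " + millis + " ms");
        }
    }
}
```

If the time per doubling roughly doubles, the build is close to linear; if it roughly quadruples, that points to quadratic behaviour in the tree model being tested. Timings from a single run are noisy (JIT warm-up, garbage collection), so they should be read as a rough indication only.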

I wrote a small test using the Saxon API so that I could try this with different tree models (perhaps you can use the same idea):

    public void testDomSizeLimits() {
        try {
            for (int i=1; i<Integer.MAX_VALUE; i*=2) {
                System.err.println("Trying size " + i);
                Configuration config = new Configuration();
                // Change the next line depending on the chosen tree model
                TinyBuilder writer = new TinyBuilder(config.makePipelineConfiguration());
                Location loc = ExplicitLocation.UNKNOWN_LOCATION;
                writer.open();
                writer.startDocument(0);
                writer.startElement(new NoNamespaceName("doc"), Untyped.getInstance(), loc, 0);
                for (int j=0; j<i; j++) {
                    writer.startElement(new NoNamespaceName("elem"), Untyped.getInstance(), loc, 0);
                    writer.characters("The quick brown fox", loc, 0);
                    writer.endElement();
                }
                writer.endDocument();
                writer.close();
            }
        } catch (XPathException e) {
            e.printStackTrace();
        }
    }

After about 16M records, both DOM and JDOM2 became unbearably slow. Saxon's TinyTree, however, keeps going until it runs out of memory:

Trying size 1
Trying size 2
Trying size 4
Trying size 8
Trying size 16
Trying size 32
Trying size 64
Trying size 128
Trying size 256
Trying size 512
Trying size 1024
Trying size 2048
Trying size 4096
Trying size 8192
Trying size 16384
Trying size 32768
Trying size 65536
Trying size 131072
Trying size 262144
Trying size 524288
Trying size 1048576
Trying size 2097152
Trying size 4194304
Trying size 8388608
Trying size 16777216
Trying size 33554432
Trying size 67108864

java.lang.OutOfMemoryError: Java heap space

at java.util.Arrays.copyOf(Arrays.java:3284)
at net.sf.saxon.tree.tiny.TinyTree.ensureNodeCapacity(TinyTree.java:233)
at net.sf.saxon.tree.tiny.TinyTree.addNode(TinyTree.java:345)
at net.sf.saxon.tree.tiny.TinyBuilder.makeTextNode(TinyBuilder.java:405)
at net.sf.saxon.tree.tiny.TinyBuilder.characters(TinyBuilder.java:381)
at jaxptest.DOMTest.testDomSizeLimits(DOMTest.java:1424)

This was run under IntelliJ with the default heap size.
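Because the default heap size depends on the machine and JVM, it can help to print the limit actually in effect before running such a test. The sketch below uses only `java.lang.Runtime`; the class name `HeapInfo` is a hypothetical name chosen for illustration.

```java
// Hypothetical helper class for reporting the heap limit of the running JVM.
class HeapInfo {

    /** Maximum heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently in use (allocated minus free), in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("Max heap:  " + maxHeapBytes() / mib + " MiB");
        System.out.println("Used heap: " + usedHeapBytes() / mib + " MiB");
    }
}
```

Running the test with a deliberately small heap, for example `java -Xmx256m ...`, means the memory limit is reached with much smaller documents, which may reduce the time and CPU cost the question complains about.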

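A more reasonable test would probably increase the depth of the tree as the number of nodes increases. I don't have time to try that today.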
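The answer does not show such a test. As a rough sketch of the idea, the following builds a tree whose depth grows with the node count by giving every element a fixed number of children. It uses the JDK's DOM rather than Saxon, and the class name `DeepDomBuilder`, the fan-out of 10, and the default depth range are all assumptions made for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not taken from the original answer.
class DeepDomBuilder {
    static final int FAN_OUT = 10; // assumed number of children per element

    /** Builds a full tree: the root plus `depth` further levels of elements. */
    static Document buildTree(int depth) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("doc");
            doc.appendChild(root);
            addChildren(doc, root, depth);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void addChildren(Document doc, Element parent, int levelsLeft) {
        if (levelsLeft == 0) {
            parent.setTextContent("The quick brown fox"); // leaf element gets text
            return;
        }
        for (int i = 0; i < FAN_OUT; i++) {
            Element child = doc.createElement("elem");
            parent.appendChild(child);
            addChildren(doc, child, levelsLeft - 1);
        }
    }

    /** Number of elements in a full tree: 1 + FAN_OUT + FAN_OUT^2 + ... */
    static long expectedElements(int depth) {
        long total = 0;
        long level = 1;
        for (int d = 0; d <= depth; d++) {
            total += level;
            level *= FAN_OUT;
        }
        return total;
    }

    /** Counts element nodes in the subtree rooted at `node`, including `node`. */
    static long countElements(Node node) {
        long count = node.getNodeType() == Node.ELEMENT_NODE ? 1 : 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += countElements(c);
        }
        return count;
    }

    public static void main(String[] args) {
        int maxDepth = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        for (int depth = 1; depth <= maxDepth; depth++) {
            long start = System.nanoTime();
            buildTree(depth);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("depth " + depth + " (" + expectedElements(depth)
                    + " elements): " + millis + " ms");
        }
    }
}
```

Each extra level multiplies the node count by the fan-out, so no single sibling list ever grows beyond `FAN_OUT` entries, and any slowdown observed is less likely to come from appending to a very long list of siblings.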

Regarding "java - How to test memory limits when parsing an XML file", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/49013886/
