java - "ScratchFileBuffer not closed" message when reading a PDF

Reposted. Author: 行者123. Updated: 2023-12-02 01:25:14

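I have this code that reads a PDF and extracts a string from it.

It works fine, but the log keeps repeating the message shown below the code, and I don't know why: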

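import java.io.File;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class Test {
    public static void main(String[] args) {

        PDDocument doc = null;
        try {
            doc = PDDocument.load(new File("C:/prueba.pdf"));
            PDFTextStripper pdfs = new PDFTextStripper();
            String textOfPdf = pdfs.getText(doc);

            // Six hyphen-separated groups of five uppercase letters or digits
            String regex = "([A-Z0-9]{5}-[A-Z0-9]{5}-[A-Z0-9]{5}-[A-Z0-9]{5}-[A-Z0-9]{5}-[A-Z0-9]{5})";
            Pattern patron = Pattern.compile(regex);
            Matcher emparejador = patron.matcher(textOfPdf);

            // group(0) throws IllegalStateException unless find() succeeded
            if (emparejador.find()) {
                String text = emparejador.group(0);
                System.out.print(text);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close the document even if loading or text extraction failed
            try {
                if (doc != null) {
                    doc.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}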

12:52:37.335 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{25, 0}
12:52:37.336 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{26, 0}
12:52:37.336 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{28, 0}
12:52:37.336 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{27, 0}
12:52:37.336 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{30, 0}
12:52:37.337 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{31, 0}
12:52:37.338 [main] DEBUG org.apache.pdfbox.pdfparser.PDFObjectStreamParser - parsed=COSObject{5, 0}

12:52:37.772 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.772 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.772 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.773 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.773 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.773 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.773 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.773 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!
12:52:37.773 [Finalizer] DEBUG org.apache.pdfbox.io.ScratchFileBuffer - ScratchFileBuffer not closed!

I also tried the tess4j library, but the same thing happened. Any ideas?

Regards

Best answer

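This is most likely an internal parser issue. By the look of it, some PDF objects do not explicitly close the temporary files they use and instead close them in their finalize method, which is why the message is logged from the Finalizer thread.

This does not look like a problem to me, and there is not much you can do about it other than turn off debug-level logging for that class, for example with this log4j property: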

log4j.logger.org.apache.pdfbox.io.ScratchFileBuffer=WARN
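
The finalizer behaviour described above can be illustrated with a small, self-contained sketch. It is not PDFBox's source: TrackedBuffer, leakWarning and FinalizerDemo are hypothetical names made up for this example, and the sketch only assumes the general pattern the answer suggests, namely an object that reports from finalize() when nobody called close() on it.

```java
// Illustrative only: hypothetical classes, not PDFBox's actual implementation.
class TrackedBuffer implements AutoCloseable {
    private boolean closed = false;

    /** Returns the warning a finalizer would log, or null once close() has been called. */
    String leakWarning() {
        return closed ? null : "TrackedBuffer not closed!";
    }

    @Override
    public void close() {
        closed = true; // a real buffer would release its temporary storage here
    }

    // Safety net: runs on the JVM's Finalizer thread at some point after the
    // object becomes unreachable. Finalization is deprecated in modern Java.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            String warning = leakWarning();
            if (warning != null) {
                System.err.println("[" + Thread.currentThread().getName() + "] " + warning);
            }
            close();
        } finally {
            super.finalize();
        }
    }
}

class FinalizerDemo {
    public static void main(String[] args) throws Exception {
        try (TrackedBuffer explicit = new TrackedBuffer()) {
            // closed by try-with-resources, so its finalizer stays silent
        }
        new TrackedBuffer(); // never closed and immediately unreachable
        System.gc();         // only a hint: the warning may or may not be printed
        Thread.sleep(200);   // give the Finalizer thread a chance to run
    }
}
```

In this sketch the warning is harmless, because the finalizer still releases the resource. That matches the answer's view that the message is noise rather than a sign of a leak in the calling code.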

Regarding "java - ScratchFileBuffer not closed message when reading a PDF", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57590598/
