
java - Error when trying to run a PDFBox program

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:10:59

I am trying to run the PDFBox example from this page, http://www.printmyfolders.com/Home/PDFBox-Tutorial, which extracts text from a PDF file. When I run it, I get the following error:

org.apache.pdfbox.exceptions.WrappedIOException
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:245)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1192)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1159)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1130)
    at GetPos.main(GetPos.java:14)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(libgcj.so.10)
    at java.io.ByteArrayOutputStream.write(libgcj.so.10)
    at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:172)
    at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:98)
    at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:295)
    at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:237)
    at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:172)
    at org.apache.pdfbox.pdfparser.PDFXrefStreamParser.<init>(PDFXrefStreamParser.java:61)
    at org.apache.pdfbox.pdfparser.PDFParser.parseXrefStream(PDFParser.java:848)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:576)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:188)
    ... 4 more

What does this mean? The first example, which uses a blank PDF, works fine.

Best answer

I used the tutorial's example to generate a PDF containing text, and then read that text back as the tutorial describes:

package com.mycompany.mavenproject;

import java.io.File;
import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.edit.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.util.PDFTextStripper;

/**
 * Round-trip test: write a one-page PDF containing a line of text,
 * then load it again and extract the text.
 */
public class AppTest extends TestCase {

    public static Test suite() {
        return new TestSuite(AppTest.class);
    }

    public void test() throws Exception {
        final String fileName = "PDFWithText.pdf";
        writeDocument(fileName);

        // PDDocument.load is the call that throws in the question.
        PDDocument pd = PDDocument.load(new File(fileName));
        try {
            PDFTextStripper stripper = new PDFTextStripper();
            String text = stripper.getText(pd);
            assertEquals("Hello from www.printmyfolders.com", text.trim());
        } finally {
            // Release the document even if the assertion fails.
            pd.close();
        }
    }

    private void writeDocument(String fileName) throws Exception {
        PDDocument doc = new PDDocument();
        PDPage page = new PDPage();
        doc.addPage(page);

        PDFont font = PDType1Font.HELVETICA_BOLD;

        // Draw a single line of text at position (100, 700).
        PDPageContentStream content = new PDPageContentStream(doc, page);
        content.beginText();
        content.setFont(font, 12);
        content.moveTextPositionByAmount(100, 700);
        content.drawString("Hello from www.printmyfolders.com");
        content.endText();
        content.close();

        doc.save(fileName);
        doc.close();
    }
}

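This runs without any exception. Since your exception is thrown inside the load method, make sure the PDF you are loading is not malformed.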
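One cheap way to rule out an obviously damaged file before handing it to PDFBox is to check its outer markers. The sketch below is only an illustration, not part of PDFBox: the class name PdfSanityCheck and its methods are hypothetical, and it uses nothing but the Java standard library. It checks that the file starts with the "%PDF-" header and that "%%EOF" appears within the last 1024 bytes (the window size is an assumption chosen for this sketch). Passing this check does not prove the file is valid; a corrupt compressed stream, like the one the FlateFilter frames in the stack trace point to, would still get through.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not a PDFBox class: a coarse structural check only.
public class PdfSanityCheck {

    // True if the data begins with the "%PDF-" header (strict: no leading bytes allowed).
    public static boolean hasPdfHeader(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // True if the "%%EOF" marker appears somewhere in the last 1024 bytes.
    public static boolean hasEofMarker(byte[] data) {
        int start = Math.max(0, data.length - 1024);
        String tail = new String(data, start, data.length - start,
                StandardCharsets.ISO_8859_1);
        return tail.contains("%%EOF");
    }

    // Both markers present: the file at least looks like a complete PDF.
    public static boolean looksLikePdf(byte[] data) {
        return hasPdfHeader(data) && hasEofMarker(data);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfSanityCheck <file.pdf>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("header present: " + hasPdfHeader(data));
        System.out.println("EOF marker present: " + hasEofMarker(data));
    }
}
```

If either check fails, the file was probably truncated or is not a PDF at all (for example, an HTML error page saved under a .pdf name). If both pass and load still throws, opening the file in another PDF reader is a reasonable next step.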

Regarding "java - Error when trying to run a PDFBox program", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/19060837/
