java - Converting HTML to PDF with iText and XMLWorker when the text contains Polish letters

Reposted. Author: 塔克拉玛干. Updated: 2023-11-01 22:44:13

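I have a String containing sample HTML. It works perfectly, but as soon as I add Polish letters, they disappear from the output. I tried something like this: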

byte[] byteArray = str.getBytes(Charset.forName("UTF-8"));
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(byteArray);
worker.parseXHtml(pdfWriter, document, byteArrayInputStream, Charset.forName("UTF-8"));

But it didn't change anything. How can I get the Polish letters to show up?

Edit: it still doesn't work.

Code:

document.open();

XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
String str = "<html><head></head><body style=\"font-size:12.0pt; font-family:Times New Roman\">"+
"<a href='http://www.rgagnon.com/howto.html'><b>Real's HowTo</b></a>" +
"<h1>Show your support</h1>" +
"<p>It DOES cost a lot to produce this site - in ISP storage and transfer fees</p>" +
"<p>TEST POLSKICH ZNAKÓW: ĄąćCÓ󣳯żŹźĘę</p>" +
"<hr/>" +
"<p>the huge amounts of time it takes for one person to design and write the actual content.</p>" +
"<p>If you feel that effort has been useful to you, perhaps you will consider giving something back?</p>" +
"<p>Donate using PayPalŽ</p>" +
"<p>Contributions via PayPal are accepted in any amount</p>" +
"<p><br/><table border='1'><tr><td>Java HowTo</td></tr><tr>" +
"<td style='background-color:red;'>Javascript HowTo</td></tr>" +
"<tr><td>Powerbuilder HowTo</td></tr></table></p>" +
"</body></html>";

byte[] byteArray = str.getBytes(Charset.forName("UTF-8"));
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(byteArray);
worker.parseXHtml(pdfWriter, document, byteArrayInputStream, Charset.forName("UTF-8"));

document.close();

Maybe someone will spot the mistake.

Best answer

I took your sample HTML and used it to create the ParseHtml2 example. The resulting PDF, html_2.pdf, looks like this:

(screenshot of html_2.pdf)

At first sight, I don't see anything wrong with the Polish characters.

The code I used looks like this:

// Imports needed by this method (iText 5 with XMLWorker)
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public void createPdf(String file) throws IOException, DocumentException {
    // step 1: create the document
    Document document = new Document();
    // step 2: bind a PdfWriter to the output file
    PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
    // step 3: open the document
    document.open();
    // step 4: parse the HTML; every non-ASCII character is written as a Unicode escape
    String str = "<html><head></head><body style=\"font-size:12.0pt; font-family:Times New Roman\">" +
        "<a href='http://www.rgagnon.com/howto.html'><b>Real's HowTo</b></a>" +
        "<h1>Show your support</h1>" +
        "<p>It DOES cost a lot to produce this site - in ISP storage and transfer fees</p>" +
        "<p>TEST POLSKICH ZNAK\u00d3W: \u0104\u0105\u0106\u0107\u00d3\u00f3\u0141\u0142\u0179\u017a\u017b\u017c\u017d\u017e\u0118\u0119</p>" +
        "<hr/>" +
        "<p>the huge amounts of time it takes for one person to design and write the actual content.</p>" +
        "<p>If you feel that effort has been useful to you, perhaps you will consider giving something back?</p>" +
        "<p>Donate using PayPal\u017d</p>" +
        "<p>Contributions via PayPal are accepted in any amount</p>" +
        "<p><br/><table border='1'><tr><td>Java HowTo</td></tr><tr>" +
        "<td style='background-color:red;'>Javascript HowTo</td></tr>" +
        "<tr><td>Powerbuilder HowTo</td></tr></table></p>" +
        "</body></html>";

    XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
    // Encode and declare the same charset so the parser decodes the bytes correctly
    InputStream is = new ByteArrayInputStream(str.getBytes(StandardCharsets.UTF_8));
    worker.parseXHtml(writer, document, is, StandardCharsets.UTF_8);
    // step 5: close the document
    document.close();
}
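
The byte conversion in step 4 can be checked on its own, without iText. The following is a minimal sketch that uses only the Java standard library; the class name `Utf8RoundTrip` and its `roundTrip` method are hypothetical names introduced for illustration and are not part of the original answer. It encodes a string of Polish letters as UTF-8 and decodes the bytes again, once with the matching charset and once with a mismatched one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of iText or the original answer).
public final class Utf8RoundTrip {
    private Utf8RoundTrip() {}

    /** Encodes the text as UTF-8 bytes and decodes those bytes with the given charset. */
    public static String roundTrip(String text, Charset decodeWith) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Polish letters written as Unicode escapes so this file stays pure ASCII
        String polish = "\u0104\u0105\u0106\u0107\u00d3\u00f3\u0141\u0142";

        // Same charset on both sides: the text survives unchanged
        System.out.println(polish.equals(roundTrip(polish, StandardCharsets.UTF_8)));

        // Mismatched charset: each two-byte sequence is decoded as two separate characters
        System.out.println(polish.equals(roundTrip(polish, StandardCharsets.ISO_8859_1)));
    }
}
```

If the first comparison holds, the stream handed to `parseXHtml` carries the intended bytes, which suggests looking elsewhere for the cause, for example at the font or at how the source file itself was encoded.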

Note that you define Times New Roman as the font. Your operating system needs to have access to a font with that name; otherwise you will end up with Helvetica anyway.

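Also note that using non-ASCII characters in source code is considered a crime against good taste. Source code is stored as a text file, but in which encoding? There is no guarantee that your file will be saved as UTF-8, no guarantee that the compiler will read it as UTF-8, no guarantee that your version control system will respect UTF-8, and so on. That is why I replaced all non-ASCII characters with their Unicode escapes, which allows me to save the source file as plain ASCII.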
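
Writing the escapes by hand is tedious, so a small converter can help. The following is a rough sketch using only the Java standard library; the class name `UnicodeEscapes` and the method `toAscii` are hypothetical and were not part of the original answer. It replaces every character outside 7-bit ASCII with the corresponding Java Unicode escape, so the result can be pasted into a string literal.

```java
// Hypothetical helper, for illustration only (not part of iText or the original answer).
public final class UnicodeEscapes {
    private UnicodeEscapes() {}

    /** Returns the text with every non-ASCII char replaced by its four-digit hex escape. */
    public static String toAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                // Plain ASCII is copied through unchanged
                out.append(c);
            } else {
                // Emit a backslash, the letter u, and four lowercase hex digits
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input itself uses escapes so this file stays pure ASCII
        System.out.println(toAscii("ZNAK\u00d3W: \u0104\u0105"));
    }
}
```

The helper works on UTF-16 code units, so a character outside the Basic Multilingual Plane comes out as two escapes (a surrogate pair), which is also how Java source represents it.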

For more on java - converting HTML to PDF with iText and XMLWorker when the text contains Polish letters, see the similar question on Stack Overflow: https://stackoverflow.com/questions/29102552/
