c# - iTextSharp HTMLWorker.ParseToList() throws NullReferenceException

Reposted. Author: 太空狗. Updated: 2023-10-29 13:28:53

I was using iTextSharp v.4 to merge a large number of HTML files. It worked fine until I needed to upgrade to iTextSharp v.5.

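The problem occurs when I pass a stream reader (which reads the contents of an HTML file) to the ParseToList method of the HTMLWorker object. It throws a NullReferenceException. While debugging, I can access the streamReader and confirm that the correct file contents have been read.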

The code is as follows:

List<IElement> objects;
try
{
    objects = HTMLWorker.ParseToList(new StringReader(htmlString), null);
}
catch (Exception e)
{
    htmlString = "<html><head></head><body><br/><br/><h2 style='color:#FF0000'>ERROR READING FILE!</h2><h3>File Excluded From Stitched Document!</h3><br/><br/><p>There was an error while trying to read the following file:</p><p><span style='color:#FF0000'>" + fileName + "</span></p></body></html>";
    objects = HTMLWorker.ParseToList(new StringReader(htmlString), null);
}

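In the catch block you can see that I then use almost identical code to add text to the PDF saying that there was a problem. That code works fine. This of course makes me think the problem lies in the content of the original HTML string, so here is the content of the string as it is just before being passed to the parser: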

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="cache-control" content="no-cache" />
</head>
<body style="font-family: Arial, Helvetica, sans-serif; font-size: 1em; margin: 0;
padding: 0;">
<div style="font-size: 1em; line-height: 1.25em; width: 190mm;">
<h1 style="font-size: 1.5em; font-weight: bold; margin: 0 0 1.5em 0; text-align: center;">
Advice Item 1</h1>
<table border="0" style="width: 190mm; border-collapse: collapse; margin: 0 0 1.5em 0;
width: 100%;">
<tbody>
<tr>
<td style="width: 35mm; height: 1px; line-height: 1px; font-size: 1px;">
&nbsp;
</td>
<td>
</td>
<td style="width: 30mm; height: 1px; line-height: 1px; font-size: 1px;">
&nbsp;
</td>
<td>
</td>
</tr>
<tr>
<td colspan="4" style="font-weight: bold;">
<span id="litPatchedToCC" style="text-align: right; font-weight: bold;"></span>
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
By:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
ABC
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
From:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
CC
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Date:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
29/03/2011 13:35
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
To:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
Member Practice
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Folder:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
A15-123456
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Individual:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
Miss A B Test
</td>
</tr>
<tr>
<td colspan="2">
<hr width="100%" />
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Of:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
Lorem &amp; Ipsum
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Species:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
Bovine
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Position:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
Member
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Item Type:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Tel:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
0123 01234
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
</th>
<td style="font-weight: bold; padding: 2px 5px;">
</td>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Other Nos:
</th>
<td style="font-weight: bold; padding: 2px 5px;">
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Reason For Call:
</th>
<td colspan="3" style="font-weight: bold; padding: 2px 5px;">
Some Reason
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
Subject:
</th>
<td colspan="3" style="font-weight: bold; padding: 2px 5px;">
Some problem.
</td>
</tr>
<tr>
<th scope="row" style="text-align: right; font-weight: normal; padding: 2px 5px;">
</th>
<td>
</td>
<th scope="row" colspan="2" style="text-align: right; font-weight: normal; padding: 2px 5px;">
</th>
<td colspan="2">
</td>
</tr>
<tr>
<td style="font-size: 1.5em; font-weight: bold; text-align: center;" colspan="4">
Internal
</td>
</tr>
<tr>
<td colspan="4" style="text-align: center; padding: 2px 5px;">
<hr width="100%" />
</td>
</tr>
</tbody>
</table>
<div style="padding: 2px 5px;">
<p>
Here we start the discussion.</p>
<br />
<p>
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
<br />
<p>
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
</div>
</div>
</body>
</html>

Thanks for your help. 霍夫纳威利

Best answer

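It looks like HTMLWorker is choking on the two <hr width="100%" /> elements. Since you say you are upgrading to V5.XX, this may also be a good time to start using XMLWorker to parse your HTML; the development team recommends it. (The latest HTMLWorker source code even contains a short note pointing this out.)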
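
If you have to stay on HTMLWorker for the moment, one way to check whether the <hr> elements really are the trigger is to strip them from the string before parsing. The following is only a minimal sketch, assuming the tags are simple enough for a regular expression to match. HrStripper and StripHr are names made up for this example and are not part of iTextSharp:

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrStripper
{
    // Matches <hr>, <hr/>, <hr width="100%" /> and similar, case-insensitively.
    // The \b keeps it from matching other tags that merely start with "hr".
    private static readonly Regex HrTag =
        new Regex(@"<hr\b[^>]*>", RegexOptions.IgnoreCase);

    // Returns the input with every <hr> element removed (hypothetical helper).
    public static string StripHr(string html)
    {
        return HrTag.Replace(html, string.Empty);
    }

    public static void Main()
    {
        string sample = "<td colspan=\"2\"><hr width=\"100%\" /></td>";
        // Prints: <td colspan="2"></td>
        Console.WriteLine(StripHr(sample));
    }
}
```

Passing StripHr(htmlString) to HTMLWorker.ParseToList in place of htmlString would show whether the exception goes away. Whether it does is something to verify against your own files, and the horizontal rules are lost from the output either way.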

I tested XMLWorker against your extended HTML: it works, and it is not too hard to implement :)

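// Namespaces needed: System.IO, iTextSharp.text, iTextSharp.text.pdf, iTextSharp.tool.xml
using (Document document = new Document()) {
    // Write the PDF straight to the HTTP response stream.
    PdfWriter writer = PdfWriter.GetInstance(document, Response.OutputStream);
    document.Open();
    // The original wrapped this in a try/catch that only rethrew, so it is omitted here.
    using (StringReader sr = new StringReader(htmlString)) {
        XMLWorkerHelper.GetInstance().ParseXHtml(
            writer, document, sr
        );
    }
}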

This was tested in a web environment, so replace Response.OutputStream with the Stream of your choice.
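
For example, outside a web application the same call can write to a file instead. This is a sketch under assumptions rather than tested code: it assumes iTextSharp 5.x and its XMLWorker add-on are both referenced, and HtmlToPdf, WritePdf and outputPath are hypothetical names introduced here for illustration:

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdf
{
    // Parses an XHTML string with XMLWorker and writes the resulting PDF
    // to outputPath (hypothetical helper, not part of iTextSharp).
    public static void WritePdf(string htmlString, string outputPath)
    {
        // FileStream stands in for Response.OutputStream from the web version.
        using (FileStream output = new FileStream(outputPath, FileMode.Create))
        using (Document document = new Document())
        {
            PdfWriter writer = PdfWriter.GetInstance(document, output);
            document.Open();
            using (StringReader sr = new StringReader(htmlString))
            {
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, sr);
            }
        }
    }
}
```

Any other writable Stream, such as a MemoryStream, can be passed to PdfWriter.GetInstance in the same way.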

Regarding c# - iTextSharp HTMLWorker.ParseToList() throws NullReferenceException, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/8574174/
