c# - How do I read a PDF file line by line in C#?

Reposted. Author: 太空狗. Updated: 2023-10-30 00:40:41
In my Windows 8 app, I want to read a PDF line by line and then store the lines in a string array. How can I do that?

    public StringBuilder addd = new StringBuilder();
    string[] array;

    private async void btndosyasec_Click(object sender, RoutedEventArgs e)
    {
        FileOpenPicker openPicker = new FileOpenPicker();
        openPicker.ViewMode = PickerViewMode.List;
        openPicker.SuggestedStartLocation = PickerLocationId.PicturesLibrary;
        openPicker.FileTypeFilter.Add(".pdf");

        StorageFile file = await openPicker.PickSingleFileAsync();

        if (file != null)
        {
            PdfReader reader = new PdfReader((await file.OpenReadAsync()).AsStream());

            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                addd.Append(PdfTextExtractor.GetTextFromPage(reader, page));
                string tmp = PdfTextExtractor.GetTextFromPage(reader, page);

                array[page] = tmp.ToString();

                reader.Close();
            }
        }
    }

Best answer

Hi, I ran into this problem too. I used the code below and it worked.

You will need a reference to the iTextSharp library.

    using System.Text;
    using iTextSharp.text.pdf;
    using iTextSharp.text.pdf.parser;

    PdfReader reader = new PdfReader(@"D:\test pdf\Blood Journal.pdf");
    int intPageNum = reader.NumberOfPages;
    string text;     // text of the current page
    string[] words;  // lines of the current page
    string line;

    // iTextSharp page numbers start at 1.
    for (int i = 1; i <= intPageNum; i++)
    {
        text = PdfTextExtractor.GetTextFromPage(reader, i, new LocationTextExtractionStrategy());

        // Split the page text into lines.
        words = text.Split('\n');
        for (int j = 0, len = words.Length; j < len; j++)
        {
            line = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(words[j]));
        }
    }

    reader.Close();

After each pass of the outer loop, the words array holds the lines of the page that was just read.
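
Because words is reassigned for every page, it only ever holds one page at a time. To end up with a single string array for the whole document, as the question asks, the per-page lines can be accumulated in a list. The following is a minimal sketch and not part of the original answer: PdfLines and FromPages are hypothetical names, and the page texts are assumed to come from PdfTextExtractor.GetTextFromPage as in the answer's loop.

```csharp
using System;
using System.Collections.Generic;

public static class PdfLines
{
    // Hypothetical helper (not part of iTextSharp): flattens the text of
    // each page into one array of lines, keeping page order.
    public static string[] FromPages(IEnumerable<string> pageTexts)
    {
        var lines = new List<string>();
        foreach (string pageText in pageTexts)
        {
            // Extracted text may use "\n" or "\r\n" as the line break,
            // so both are accepted as separators.
            string[] pageLines = pageText.Split(
                new[] { "\r\n", "\n" }, StringSplitOptions.None);
            lines.AddRange(pageLines);
        }
        return lines.ToArray();
    }
}

// Assumed usage with the reader from the answer:
//   var pages = new List<string>();
//   for (int i = 1; i <= reader.NumberOfPages; i++)
//       pages.Add(PdfTextExtractor.GetTextFromPage(reader, i));
//   string[] allLines = PdfLines.FromPages(pages);
```

Using a List<string> avoids having to know the total number of lines in advance. Passing StringSplitOptions.RemoveEmptyEntries instead of None would drop blank lines, if those are not wanted.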

Regarding "c# - How do I read a PDF file line by line in C#?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/25424816/
