c# - Is there a way to read a Word document line by line?

Reposted. Author: 行者123. Updated: 2023-11-30 22:36:34

I'm trying to extract all the words in a Word document. I can do it all in one pass, as follows:

// Assumes: using System; using System.Collections.Generic; using System.Linq;
//          using Word = Microsoft.Office.Interop.Word;
// doc, wordPosition, index and charLocation are declared elsewhere in the asker's code.
Word.Application word = new Word.Application();
doc = word.Documents.Open(@"C:\SampleText.doc");
doc.Activate();

foreach (Word.Range docRange in doc.Words) // loads all words in the document
{
    // Suffixes of the word's text, one per character position, in a custom sort order
    IEnumerable<string> sortedSubstrings = Enumerable.Range(0, docRange.Text.Trim().Length)
        .Select(i => docRange.Text.Substring(i))
        .OrderBy(s => s.Length < 3 ? s : s.Remove(2, Math.Min(s.Length - 2, 2)));

    // Column number of the word's first character
    wordPosition = (int)docRange.get_Information(
        Word.WdInformation.wdFirstCharacterColumnNumber);

    foreach (var substring in sortedSubstrings)
    {
        index = docRange.Text.IndexOf(substring) + wordPosition;
        charLocation[index] = substring;
    }
}

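However, I would rather load the document one line at a time. Is that possible?

I can load it paragraph by paragraph, but I can't iterate over a paragraph to extract all of its words: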

foreach (Word.Paragraph para in doc.Paragraphs)
{
    foreach (Word.Range docRange in para) // Error: type Word.Paragraph is not enumerable
    {
        // Suffixes of the word's text, one per character position, in a custom sort order
        IEnumerable<string> sortedSubstrings = Enumerable.Range(0, docRange.Text.Trim().Length)
            .Select(i => docRange.Text.Substring(i))
            .OrderBy(s => s.Length < 3 ? s : s.Remove(2, Math.Min(s.Length - 2, 2)));

        // Column number of the word's first character
        wordPosition = (int)docRange.get_Information(
            Word.WdInformation.wdFirstCharacterColumnNumber);

        foreach (var substring in sortedSubstrings)
        {
            index = docRange.Text.IndexOf(substring) + wordPosition;
            charLocation[index] = substring;
        }
    }
}

Best answer

This will help you get the text line by line.

// Assumes: using System.IO; using System.Reflection; using System.Windows.Forms;
//          using Word = Microsoft.Office.Interop.Word;
object file = Path.GetDirectoryName(Application.ExecutablePath) + @"\Answer.doc";

// new Word.Application() works whether or not interop types are embedded
Word.Application wordObject = new Word.Application();
wordObject.Visible = false;

object nullobject = Missing.Value;
Word.Document docs = wordObject.Documents.Open
    (ref file, ref nullobject, ref nullobject, ref nullobject,
     ref nullobject, ref nullobject, ref nullobject, ref nullobject,
     ref nullobject, ref nullobject, ref nullobject, ref nullobject,
     ref nullobject, ref nullobject, ref nullobject, ref nullobject);

string strLine;
bool bolEOF = false;

// Start with the selection on the first character of the document
docs.Characters[1].Select();

int index = 0;
do
{
    // Extend the selection to the end of the current line
    object unit = Word.WdUnits.wdLine;
    object count = 1;
    wordObject.Selection.MoveEnd(ref unit, ref count);

    strLine = wordObject.Selection.Text;
    richTextBox1.Text += ++index + " - " + strLine + "\r\n"; // for our understanding

    // Collapse to the end of the selection, which is the start of the next line
    object direction = Word.WdCollapseDirection.wdCollapseEnd;
    wordObject.Selection.Collapse(ref direction);

    // The predefined \EndOfDoc bookmark marks the end of the document
    if (wordObject.Selection.Bookmarks.Exists(@"\EndOfDoc"))
        bolEOF = true;
} while (!bolEOF);

docs.Close(ref nullobject, ref nullobject, ref nullobject);
wordObject.Quit(ref nullobject, ref nullobject, ref nullobject);
docs = null;
wordObject = null;

Here is the genius behind the code. Follow the link for more explanation of how it works.
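
As for the paragraph-based loop in the question: Word.Paragraph itself cannot be enumerated, but its Range property exposes a Words collection that can. The following is a minimal sketch of that idea and is not part of the original answer. The class and method names (ParagraphWords, WordsByParagraph) are hypothetical, and it assumes a Word.Document that has already been opened.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class ParagraphWords
{
    // Hypothetical helper: yields the trimmed text of each word,
    // visiting the document one paragraph at a time.
    public static IEnumerable<string> WordsByParagraph(Word.Document doc)
    {
        foreach (Word.Paragraph para in doc.Paragraphs)
        {
            // Paragraph is not enumerable, but para.Range.Words is.
            foreach (Word.Range wordRange in para.Range.Words)
            {
                string text = wordRange.Text?.Trim();

                // Skip items that are only whitespace, such as paragraph marks.
                if (!string.IsNullOrEmpty(text))
                {
                    yield return text;
                }
            }
        }
    }
}
```

It could be called as foreach (string w in ParagraphWords.WordsByParagraph(doc)) { ... }. Note that Word's Words collection also treats punctuation marks as separate items, so the caller may want to filter those out as well.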

Regarding "c# - Is there a way to read a Word document line by line", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6924056/
