
c# - Splitting a text document into multiple sections using C#

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:39:07

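I am parsing a text file with a semi-known, repeating structure. Each item has a head (1 line), a subhead (1 or 2 lines), and a content area (any number of lines). Every item in the document is formatted like this: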

 =========================
       Head Text 1
=========================
      SubHead Text1
      SubHead Text2
=========================
 Content Text Line 1
 Content Text Line 2
 ...
 Content Text Line 8
=========================
       Head Text 2
=========================
      SubHead Text1
      SubHead Text2
=========================
 Content Text Line 1
 Content Text Line 2
 ...
 Content Text Line 6

I would like each section to be inside a unique object, each with 3 parts, something like:

section1.head
section1.subHead
section1.content
section2.head
section2.subHead
section2.content

The only way I can think of accomplishing this involves a lot of if and while statements. Is there an efficient way of accomplishing this?

I originally tried writing some code in JScript, but I'm reading an RTF file and C# provides an easy way of converting RTF to plain text. The JScript attempt didn't work very well: I kept skipping some dividers and would get an error at the end of the file.

page = new Array();

fso = new ActiveXObject("Scripting.FileSystemObject");
f = fso.GetFile("test.rtf");

is = f.OpenAsTextStream( forReading, -2 );

var count = 0;
while( !is.AtEndOfStream ){
    page[count] = is.ReadLine();
    count++;
}

is.Close();

WScript.Echo( page[0].text);

var item = [];

var section = 0;

var i = 0, k = 0;
while (i < page.length) {
    item[k] = {};

    if (!page[i].indexOf("=====")) {
        i++;
        item[k].head = page[i];
        i+=2;
        while(page[i].indexOf("=====")) {
            // WScript.Echo( "index = " + i + " "+ page[i] +"\n" + "Next index = " + (i+1) + " "+ page[i+1] +"\n" );
            item[k].subHead += page[i];
            i++;
        }

        k++;

    }
    i++;
}

Best answer

If you want to cut down on the IFs, you can implement a State pattern and hand each line to the current state.

http://en.wikipedia.org/wiki/State_pattern
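
To make the suggestion concrete, here is a minimal sketch of a line-by-line state machine for the format in the question. It is written in modern JavaScript (Node.js rather than Windows Script Host JScript) to stay close to the question's attempt. The same shape should carry over to C# as one class per state behind a shared interface. The names `isDivider`, `parseSections`, and the state names are made up for illustration. The sketch also assumes that a divider is a line of five or more `=` characters and that blank lines can be ignored.

```javascript
// Assumption: a divider is a line consisting only of five or more '=' characters.
function isDivider(line) {
  return /^\s*={5,}\s*$/.test(line);
}

// Each state handles one line and returns the state that handles the next line.
const states = {
  // Before the first section: skip lines until a divider opens a head.
  seekDivider: {
    handle(line, ctx) {
      return isDivider(line) ? states.head : states.seekDivider;
    },
  },
  // The line right after an opening divider is the head text.
  head: {
    handle(line, ctx) {
      ctx.current = { head: line.trim(), subHead: [], content: [] };
      ctx.sections.push(ctx.current);
      return states.closeHead;
    },
  },
  // Wait for the divider that closes the head.
  closeHead: {
    handle(line, ctx) {
      return isDivider(line) ? states.subHead : states.closeHead;
    },
  },
  // Collect subhead lines (1 or 2 in the question) until the next divider.
  subHead: {
    handle(line, ctx) {
      if (isDivider(line)) return states.content;
      ctx.current.subHead.push(line.trim());
      return states.subHead;
    },
  },
  // Collect content lines; a divider here opens the next section's head.
  content: {
    handle(line, ctx) {
      if (isDivider(line)) return states.head;
      ctx.current.content.push(line.trim());
      return states.content;
    },
  },
};

// Feed every non-blank line to the current state and collect the sections.
function parseSections(text) {
  const ctx = { sections: [], current: null };
  let state = states.seekDivider;
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") continue; // assumption: blank lines carry no meaning
    state = state.handle(line, ctx);
  }
  return ctx.sections;
}
```

Calling `parseSections(plainText)` returns an array of objects with `head`, `subHead`, and `content` fields, which matches the `section1.head` / `section1.subHead` / `section1.content` shape asked for. Because the loop ends when the lines run out, there is no inner `while` that can run past the end of the file.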

Regarding "c# - Splitting a text document into multiple sections using C#", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/8189239/
