
c# - Reading a file that contains irregular line breaks

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:48:02

I am trying to read, in C#, a text file that is formatted like this:

this is a line\r\n
this is a line\r
\r\n
this is a line\r
\r\n
this is a line\r
\r\n
this is a line\r\n
this is a line\r
\r\n
etc...

I am reading each line of the file with

StreamReader.ReadLine()

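but this does not preserve the line-break characters. I need to know which kind of line break each line has, because I am counting the number of bytes per line. For example:

If a line ends with the character \r, the line occupies ((nr-of-bytes-in-line) + 1 byte) bytes (depending, of course, on the encoding); if it ends with \r\n, it occupies ((nr-of-bytes-in-line) + 2 bytes) bytes.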
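The arithmetic above can be sketched with Encoding.GetByteCount, which reports how many bytes a string occupies in a given encoding. This is only an illustration: the class name LineByteCount and the helper CountBytes are hypothetical, and it assumes the line text still has its terminator attached.

```csharp
using System.Text;

static class LineByteCount
{
    // Hypothetical helper: number of bytes a line occupies in the given
    // encoding, counting the terminator that is still attached to the text.
    public static int CountBytes(string lineWithTerminator, Encoding encoding)
    {
        return encoding.GetByteCount(lineWithTerminator);
    }
}
```

With UTF-8, CountBytes("this is a line\r", Encoding.UTF8) gives 15 and CountBytes("this is a line\r\n", Encoding.UTF8) gives 16. With Encoding.Unicode (UTF-16) the \r\n variant gives 32, which is why the "+1" and "+2" in the question only hold for single-byte encodings of those characters.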

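Edit:

I have a solution, based on israel altar's answer. By the way, Jon Skeet suggested the same approach. I implemented an overridden version of ReadLine so that it includes the line-break characters. Here is the code of the overriding function: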

public override String ReadLine()
{
    StringBuilder sb = new StringBuilder();
    while (true)
    {
        int ch = Read();
        if (ch == -1)
        {
            break; // end of stream
        }
        sb.Append((char)ch);
        if (ch == '\r' || ch == '\n')
        {
            // Keep the terminator in the result. "\r\n" is one line break;
            // a lone '\r' or '\n' also ends the line, whatever follows it.
            if (ch == '\r' && Peek() == '\n')
            {
                sb.Append((char)Read());
            }
            break;
        }
    }
    // Return null only when nothing at all was read (end of input).
    if (sb.Length > 0)
    {
        return sb.ToString();
    }
    return null;
}

Best answer

According to the .NET reference source, ReadLine is implemented like this:

// Reads a line. A line is defined as a sequence of characters followed by
// a carriage return ('\r'), a line feed ('\n'), or a carriage return
// immediately followed by a line feed. The resulting string does not
// contain the terminating carriage return and/or line feed. The returned
// value is null if the end of the input stream has been reached.
//
public virtual String ReadLine()
{
    StringBuilder sb = new StringBuilder();
    while (true) {
        int ch = Read();
        if (ch == -1) break;
        if (ch == '\r' || ch == '\n')
        {
            if (ch == '\r' && Peek() == '\n') Read();
            return sb.ToString();
        }
        sb.Append((char)ch);
    }
    if (sb.Length > 0) return sb.ToString();
    return null;
}

As you can see, you can add an if statement like this:

if (ch == '\r')
{
    //add the amount of bytes wanted
}
if (ch == '\n')
{
    //add the amount of bytes wanted
}

Or do whatever else you need at that point.
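One way to apply this idea without subclassing the reader is a small helper that walks any TextReader and reports each line together with the exact terminator that followed it. The sketch below is an assumption-laden illustration, not part of the answer: the class LineTerminators and the method ReadLines are hypothetical names, and it relies on Peek() returning the next character, which holds for StringReader and file-backed StreamReader but may not for every stream.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineTerminators
{
    // Hypothetical helper: yields pairs where Key is the line text and Value
    // is the terminator that followed it ("\r", "\n", "\r\n", or "" when the
    // input ends without a line break).
    public static IEnumerable<KeyValuePair<string, string>> ReadLines(TextReader reader)
    {
        StringBuilder sb = new StringBuilder();
        int ch;
        while ((ch = reader.Read()) != -1)
        {
            if (ch == '\r' || ch == '\n')
            {
                string terminator = ((char)ch).ToString();
                if (ch == '\r' && reader.Peek() == '\n')
                {
                    reader.Read(); // consume the '\n' of a "\r\n" pair
                    terminator = "\r\n";
                }
                yield return new KeyValuePair<string, string>(sb.ToString(), terminator);
                sb.Clear();
            }
            else
            {
                sb.Append((char)ch);
            }
        }
        if (sb.Length > 0)
        {
            // Trailing text with no line break after it.
            yield return new KeyValuePair<string, string>(sb.ToString(), "");
        }
    }
}
```

For the input "a\r\r\nb\r\nc" this yields four pairs: ("a", "\r"), ("", "\r\n"), ("b", "\r\n") and ("c", ""). The byte count of each line is then the encoding's GetByteCount of the text plus that of the terminator.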

Regarding c# - Reading a file that contains irregular line breaks, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36572492/
