
c# - How to read a text file using LINQ with a specified line number

Reposted. Author: 行者123. Updated: 2023-12-05 05:38:54

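I have a text file. I need to find the total number of lines that start with "3". That total is already provided in the file, in the line that starts with "7200": the field starts at position 05 and has length 6. Similarly, the total amount is also available in the line that starts with "7200": that field starts at position 21 and has length 12.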

.txt File Data

211 87236486287346872837468724682871238483XYZ BANK             
1200ABCDEF 8128361287AXTAKJ COLL
3270210000893281012870095628 00002500 8981273687jhgsjhdg

3270210000896281712870095628 00002500 1231273687jhgajhdj

3270210000891286712870095628 00002500 4561273687cxvnmbal

3270210000899283612870095628 00002500 7891273687nmkdkjhk

720000000400021000080000000100000000000000008128361287
9000001000001000000010002100008000000010000000000000000

For example: in my file, the total number of lines starting with 3 is available in the line starting with "7", i.e. "000004".

The total amount is in the same line starting with "7", i.e. "000000010000".
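
To make the position arithmetic concrete, here is a minimal sketch of pulling both fields out of the trailer line. The 1-based positions from the question (05 and 21) become 0-based offsets 4 and 20 for `Substring`. The `TrailerFields` class and its `Parse` method are hypothetical names, and parsing the amount as a `long` is an assumption; the question keeps it as a string.

```csharp
using System;
using System.Globalization;

public static class TrailerFields
{
    // Hypothetical helper: extracts the two fixed-width fields from a "7200" trailer line.
    public static (int Count, long Amount) Parse(string trailer)
    {
        // Position 5, length 6 (1-based) is Substring(4, 6) in 0-based terms.
        int count = int.Parse(trailer.Substring(4, 6), CultureInfo.InvariantCulture);
        // Position 21, length 12 (1-based) is Substring(20, 12).
        // Assumption: the amount is a plain integer; the question stores it as a string.
        long amount = long.Parse(trailer.Substring(20, 12), CultureInfo.InvariantCulture);
        return (count, amount);
    }

    public static void Main()
    {
        // Trailer line taken from the sample file above.
        var (count, amount) = Parse("720000000400021000080000000100000000000000008128361287");
        Console.WriteLine($"Count = {count}, Amount = {amount}");
    }
}
```

With the sample trailer this yields a count of 4 and an amount of 10000, which agrees with the four "3" lines of 00002500 each.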

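Currently I use the C# code below to loop over the whole file, find the line that starts with 7, and read the values at the positions above. However, this may take too long because the file can contain a very large number of records, around 200K.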

foreach (var line in FileLines)
{
    //// If line length is zero, then do nothing
    if (line.Length == 0)
    {
        continue;
    }

    switch (line.Substring(1, 1))
    {
        case "7":
            totalCount = int.Parse(line.Substring(4, 6));
            TotalAmount = line.Substring(20, 12);
            break;

        default:
            throw new Exception();
    }
}

Is there any way to rewrite my code using LINQ so that it performs better?

Best answer

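Here is the LINQ statement. What makes it more efficient is that it uses Reverse, because you mentioned that the information you are looking for is in the footer. Note that File.ReadAllLines still reads the whole file into memory, so Reverse shortens the search, not the file read.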

// Requires: using System; using System.IO; using System.Linq; using System.Reflection;
static void Main(string[] args)
{
    var path = Path.Combine(
        Path.GetDirectoryName(Assembly.GetEntryAssembly().Location),
        "TextFile.txt");
    try
    {
        // Search from the end: the line starting with '7' is in the footer.
        var count =
            int.Parse(
                File.ReadAllLines(path)
                    .Reverse()
                    .First(line => line.Any() && (line.First() == '7'))
                    .Substring(4, 6));
        Console.WriteLine($"Count = {count}");
    }
    catch (Exception ex)
    {
        System.Diagnostics.Debug.Assert(false, ex.Message);
    }
}

[Image: console output]
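
Since the trailer sits at the end of the file, another option is to avoid reading all 200K lines in the first place. The sketch below is not part of the original answer: it seeks to the last few kilobytes and searches only that tail. `TailReader`, `FindTrailer`, and the 4096-byte default are hypothetical choices, and the code assumes a single-byte (ASCII) encoding and a trailer that lies within the tail window.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class TailReader
{
    // Hypothetical helper: returns the last line starting with '7' found in the
    // final tailBytes bytes of the file, or null if there is none.
    // Assumption: single-byte encoding, so seeking to a byte offset is safe.
    public static string FindTrailer(string path, int tailBytes = 4096)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long start = Math.Max(0, stream.Length - tailBytes);
            stream.Seek(start, SeekOrigin.Begin);
            using (var reader = new StreamReader(stream, Encoding.ASCII))
            {
                var lines = reader.ReadToEnd().Split('\n').Select(l => l.TrimEnd('\r'));
                // When the read starts mid-file, the first piece may be a partial line; skip it.
                if (start > 0) lines = lines.Skip(1);
                return lines.LastOrDefault(l => l.StartsWith("7", StringComparison.Ordinal));
            }
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[]
        {
            "3270210000893281012870095628 00002500 8981273687jhgsjhdg",
            "720000000400021000080000000100000000000000008128361287",
            "9000001000001000000010002100008000000010000000000000000",
        });
        string trailer = FindTrailer(path);
        Console.WriteLine(trailer?.Substring(4, 6) ?? "no trailer found");
        File.Delete(path);
    }
}
```

The trade-off is that the tail window must be large enough to contain the trailer; if `FindTrailer` returns null, falling back to a full scan is a reasonable safety net.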

Edit

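You asked a good question about performance. The great thing is that we don't have to speculate or guess! There is always a way to measure performance.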

Here is a benchmark I just put together. I did it really quickly, so if anyone spots something I missed, please point it out. But this is what I got:

// Requires: using System; using System.Collections.Generic; using System.IO;
//           using System.Linq; using System.Reflection;
static void Main(string[] args)
{
    var path = Path.Combine(
        Path.GetDirectoryName(Assembly.GetEntryAssembly().Location),
        "TextFile.txt");
    try
    {
        // 200K lines of random guids
        List<string> builder =
            Enumerable.Range(0, 200000)
                .Select(n => $"{{{System.Guid.NewGuid().ToString()}}}")
                .ToList();

        var footer =
            File.ReadAllLines(path);

        builder.AddRange(footer);

        var FileLines = builder.ToArray();

        var benchmark = new System.Diagnostics.Stopwatch();
        benchmark.Start();
        int totalCount = int.MinValue;
        foreach (var line in FileLines)
        {
            //// If line length is zero, then do nothing
            if (line.Length == 0)
            {
                continue;
            }
            // Original code from post
            // switch (line.Substring(1, 1))
            // Should be:
            switch (line.Substring(0, 1))
            {
                case "7":
                    totalCount = int.Parse(line.Substring(4, 6));
                    // This is another issue!! Breaking from the switch DOESN'T break from the loop
                    break;
                    // SHOULD BE: goto breakFromInner;
                    // One of the few good reasons to use a goto statement!!
            }
        }
        benchmark.Stop();
        Console.WriteLine($"200K lines using Original code: Elapsed = {benchmark.Elapsed}");
        Console.WriteLine($"Count = {totalCount}");


        benchmark.Restart();
        for (int i = FileLines.Length - 1; i >= 0; i--)
        {
            var line = FileLines[i];
            //// If line length is zero, then do nothing
            if (line.Length == 0)
            {
                continue;
            }
            // Original code from post
            // switch (line.Substring(1, 1))
            // Should be:
            switch (line.Substring(0, 1))
            {
                case "7":
                    totalCount = int.Parse(line.Substring(4, 6));
                    // One of the few good reasons to use a goto statement!!
                    goto breakFromInner;
            }
        }
        // See note
        breakFromInner:
        benchmark.Stop();
        Console.WriteLine($"200K lines using Original code with reverse: Elapsed = {benchmark.Elapsed}");
        Console.WriteLine($"Count = {totalCount}");

        benchmark.Restart();
        var count =
            int.Parse(
                FileLines
                    .Reverse()
                    .First(line => line.Any() && (line.First() == '7'))
                    .Substring(4, 6));
        benchmark.Stop();
        Console.WriteLine($"200K lines using Linq with Reverse: Elapsed = {benchmark.Elapsed}");
        Console.WriteLine($"Count = {count}");
    }
    catch (Exception ex)
    {
        System.Diagnostics.Debug.Assert(false, ex.Message);
    }
}

[Image: console output]
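
If the goto in the reverse loop feels uncomfortable, one alternative (a sketch, not part of the original answer) is to move the backward scan into its own method and return as soon as the trailer is found. `FooterSearch` and `FindCount` are hypothetical names; comparing `line[0]` instead of calling `Substring(0, 1)` is a small extra change that avoids allocating a string per line.

```csharp
using System;
using System.Diagnostics;

public static class FooterSearch
{
    // Hypothetical helper: scans backwards and returns as soon as a line starting
    // with '7' is found, so neither goto nor a loop flag is needed.
    // Returns null when no such line exists.
    public static int? FindCount(string[] lines)
    {
        for (int i = lines.Length - 1; i >= 0; i--)
        {
            string line = lines[i];
            // Skip empty lines; compare the first character directly.
            if (line.Length > 0 && line[0] == '7')
            {
                return int.Parse(line.Substring(4, 6));
            }
        }
        return null;
    }

    public static void Main()
    {
        // Shortened, made-up lines for illustration only.
        var lines = new[] { "3270...", "", "7200000004000210", "9000001" };
        var watch = Stopwatch.StartNew();
        int? count = FindCount(lines);
        watch.Stop();
        Console.WriteLine($"Count = {count}, Elapsed = {watch.Elapsed}");
    }
}
```

Returning a nullable int also makes the "no trailer" case explicit, where the LINQ version with First would throw instead.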

Regarding "c# - How to read a text file using LINQ with a specified line number", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/72823078/
