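I am trying to use FileStream.Seek to quickly jump to a line and read it.
However, I am not getting the right results. I have been looking at this for a while and can't understand what I am doing wrong.
Environment:
OS: Windows 7
Framework: .NET 4.0
IDE: Visual C# Express 2010
Sample data in file location: C:\Temp\Temp.txt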
0001|100!2500
0002|100!2500
0003|100!2500
0004|100!2500
0005|100!2500
0006|100!2500
0007|100!2500
0008|100!2500
0009|100!2500
0010|100!2500
The code:
class PaddedFileSearch
{
    private int LineLength { get; set; }
    private string FileName { get; set; }

    public PaddedFileSearch()
    {
        FileName = @"C:\Temp\Temp.txt"; // This is a padded file. All lines are of the same length.
        FindLineLength();
        Debug.Print("File Line length: {0}", LineLength);

        // TODO: This purely for testing. Move this code out.
        SeekMethod(new int[] { 5, 3, 4 });
        /* Expected Results:
         * Line No     Position    Line
         * -------     --------    -----------------
         * 3           30          0003|100!2500
         * 4           15          0004|100!2500
         * 5           15          0005|100!2500 -- This was updated after the initial request.
         */

        /* THIS DOES NOT GIVE THE EXPECTED RESULTS */
        SeekMethod(new int[] { 5, 3 });
        /* Expected Results:
         * Line No     Position    Line
         * -------     --------    -----------------
         * 3           30          0003|100!2500
         * 5           30          0005|100!2500
         */
    }

    private void FindLineLength()
    {
        string line;
        // Add check for FileExists
        using (StreamReader reader = new StreamReader(FileName))
        {
            if ((line = reader.ReadLine()) != null)
            {
                LineLength = line.Length + 2;
                // The 2 is for NewLine(\r\n)
            }
        }
    }

    public void SeekMethod(int[] lineNos)
    {
        long position = 0;
        string line = null;
        Array.Sort(lineNos);
        Debug.Print("");
        Debug.Print("Line No\t\tPosition\t\tLine");
        Debug.Print("-------\t\t--------\t\t-----------------");
        using (FileStream fs = new FileStream(FileName, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            using (StreamReader reader = new StreamReader(fs))
            {
                foreach (int lineNo in lineNos)
                {
                    position = (lineNo - 1) * LineLength - position;
                    fs.Seek(position, SeekOrigin.Current);
                    if ((line = reader.ReadLine()) != null)
                    {
                        Debug.Print("{0}\t\t\t{1}\t\t\t\t{2}", lineNo, position, line);
                    }
                }
            }
        }
    }
}
The output I get:
File Line length: 15

Line No     Position    Line
-------     --------    -----------------
3           30          0003|100!2500
4           15          0004|100!2500
5           45          0005|100!2500

Line No     Position    Line
-------     --------    -----------------
3           30          0003|100!2500
5           30          0004|100!2500
My problem is with the following output:
Line No     Position    Line
-------     --------    -----------------
5           30          0004|100!2500
The output for Line should be: 0005|100!2500
I don't understand why this is happening.
Am I doing something wrong? Is there a workaround? Also, are there faster ways to do this using something like Seek?
(I am looking for code-based options and NOT Oracle or SQL Server. For the sake of argument, let's also say that the file size is 1 GB.)
Any help is greatly appreciated.
Thanks.
UPDATE:
I found 4 great answers here. Thanks a lot.
Sample Timings:
Based on a few runs the following are the methods from best to good. Even the good is very close to best.
The test file contains 10K lines (2.28 MB). I searched for the same 5000 random lines using all the options.
The code is shown below. After saving it, you can call it with TestPaddedFileSeek.CallPaddedFileSeek(); . You will also have to specify the namespace and the using directives.
`
/// <summary>
/// This class offers multiple options for reading a line by line number in a padded file (all lines are the same length).
/// The idea is to jump quickly to a line in the file.
/// Details about the discussion are available at: http://stackoverflow.com/questions/5201414/having-a-problem-while-using-filestream-seek-in-c-solved
/// </summary>
class PaddedFileSeek
{
    public FileInfo File { get; private set; }
    public int LineLength { get; private set; }

    #region Private methods
    private static int FindLineLength(FileInfo fileInfo)
    {
        using (StreamReader reader = new StreamReader(fileInfo.FullName))
        {
            string line;
            if ((line = reader.ReadLine()) != null)
            {
                int length = line.Length + 2; // The 2 is for NewLine(\r\n)
                return length;
            }
        }
        return 0;
    }

    private static void PrintHeader()
    {
        /*
        Debug.Print("");
        Debug.Print("Line No\t\tLine");
        Debug.Print("-------\t\t--------------------------");
        */
    }

    private static void PrintLine(int lineNo, string line)
    {
        //Debug.Print("{0}\t\t\t{1}", lineNo, line);
    }

    private static void PrintElapsedTime(TimeSpan elapsed)
    {
        // Print the value in milliseconds so that it matches the "ms" label.
        Debug.WriteLine("Time elapsed: {0} ms", elapsed.TotalMilliseconds);
    }
    #endregion

    public PaddedFileSeek(FileInfo fileInfo)
    {
        // Possibly might have to check for FileExists
        int length = FindLineLength(fileInfo);
        //if (length == 0) throw new PaddedProgramException();
        LineLength = length;
        File = fileInfo;
    }
    public void CallAll(int[] lineNoArray, List<int> lineNoList)
    {
        Stopwatch sw = new Stopwatch();

        #region Seek1
        // Start timing
        sw.Start();
        Debug.Write("Seek1: ");
        // Print Header
        PrintHeader();
        Seek1(lineNoArray);
        // Stop timing
        sw.Stop();
        // Print Elapsed Time
        PrintElapsedTime(sw.Elapsed);
        sw.Reset();
        #endregion

        #region Seek2
        // Start timing
        sw.Start();
        Debug.Write("Seek2: ");
        // Print Header
        PrintHeader();
        Seek2(lineNoArray);
        // Stop timing
        sw.Stop();
        // Print Elapsed Time
        PrintElapsedTime(sw.Elapsed);
        sw.Reset();
        #endregion

        #region Seek3
        // Start timing
        sw.Start();
        Debug.Write("Seek3: ");
        // Print Header
        PrintHeader();
        Seek3(lineNoArray);
        // Stop timing
        sw.Stop();
        // Print Elapsed Time
        PrintElapsedTime(sw.Elapsed);
        sw.Reset();
        #endregion

        #region Seek4
        // Start timing
        sw.Start();
        Debug.Write("Seek4: ");
        // Print Header
        PrintHeader();
        Seek4(lineNoList);
        // Stop timing
        sw.Stop();
        // Print Elapsed Time
        PrintElapsedTime(sw.Elapsed);
        sw.Reset();
        #endregion
    }
    /// <summary>
    /// Option by Jake
    /// </summary>
    /// <param name="lineNoArray"></param>
    public void Seek1(int[] lineNoArray)
    {
        long position = 0;
        string line = null;
        Array.Sort(lineNoArray);
        using (FileStream fs = new FileStream(File.FullName, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            using (StreamReader reader = new StreamReader(fs))
            {
                foreach (int lineNo in lineNoArray)
                {
                    position = (lineNo - 1) * LineLength;
                    fs.Seek(position, SeekOrigin.Begin);
                    if ((line = reader.ReadLine()) != null)
                    {
                        PrintLine(lineNo, line);
                    }
                    reader.DiscardBufferedData();
                }
            }
        }
    }

    /// <summary>
    /// Option by bitxwise
    /// </summary>
    public void Seek2(int[] lineNoArray)
    {
        string line = null;
        long step = 0;
        Array.Sort(lineNoArray);
        using (FileStream fs = new FileStream(File.FullName, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            // using (StreamReader reader = new StreamReader(fs))
            // If you put "using" here you will get WRONG results.
            // I would like to understand why this is.
            {
                foreach (int lineNo in lineNoArray)
                {
                    StreamReader reader = new StreamReader(fs);
                    step = (lineNo - 1) * LineLength - fs.Position;
                    fs.Position += step;
                    if ((line = reader.ReadLine()) != null)
                    {
                        PrintLine(lineNo, line);
                    }
                }
            }
        }
    }
    #region Seek3
    /// <summary>
    /// Option by Valentin Kuzub
    /// </summary>
    /// <param name="lineNoArray"></param>
    public void Seek3(int[] lineNoArray)
    {
        long position = 0; // totalPosition = 0;
        string line = null;
        int oldLineNo = 0;
        Array.Sort(lineNoArray);
        using (FileStream fs = new FileStream(File.FullName, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            using (StreamReader reader = new StreamReader(fs))
            {
                foreach (int lineNo in lineNoArray)
                {
                    position = (lineNo - oldLineNo - 1) * LineLength;
                    fs.Seek(position, SeekOrigin.Current);
                    line = ReadLine(fs, LineLength);
                    PrintLine(lineNo, line);
                    oldLineNo = lineNo;
                }
            }
        }
    }

    #region Required Private methods
    /// <summary>
    /// Currently only used by Seek3
    /// </summary>
    /// <param name="stream"></param>
    /// <param name="length"></param>
    /// <returns></returns>
    private static string ReadLine(FileStream stream, int length)
    {
        byte[] bytes = new byte[length];
        stream.Read(bytes, 0, length);
        return new string(Encoding.UTF8.GetChars(bytes));
    }
    #endregion
    #endregion
    #region Seek4
    /// <summary>
    /// Option by Ritch Melton
    /// </summary>
    /// <param name="lineNoList"></param>
    public void Seek4(List<int> lineNoList)
    {
        lineNoList.Sort();
        using (var fs = new FileStream(File.FullName, FileMode.Open))
        {
            lineNoList.ForEach(ln => OutputData(fs, ln));
        }
    }

    #region Required Private methods
    private void OutputData(FileStream fs, int lineNumber)
    {
        var offset = (lineNumber - 1) * LineLength;
        fs.Seek(offset, SeekOrigin.Begin);
        var data = new byte[LineLength];
        fs.Read(data, 0, LineLength);
        var text = DecodeData(data);
        PrintLine(lineNumber, text);
    }

    private static string DecodeData(byte[] data)
    {
        var encoding = new UTF8Encoding();
        return encoding.GetString(data);
    }
    #endregion
    #endregion
}
static class TestPaddedFileSeek
{
    public static void CallPaddedFileSeek()
    {
        const int arrayLength = 5000;
        int[] lineNoArray = new int[arrayLength];
        List<int> lineNoList = new List<int>();
        Random random = new Random();
        int lineNo;
        string fileName;
        fileName = @"C:\Temp\Temp.txt";
        PaddedFileSeek seeker = new PaddedFileSeek(new FileInfo(fileName));
        for (int n = 0; n < 25; n++)
        {
            Debug.Print("Loop no: {0}", n + 1);
            for (int i = 0; i < arrayLength; i++)
            {
                lineNo = random.Next(1, arrayLength);
                lineNoArray[i] = lineNo;
                lineNoList.Add(lineNo);
            }
            seeker.CallAll(lineNoArray, lineNoList);
            lineNoList.Clear();
            Debug.Print("");
        }
    }
}
`
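On the question left open in the Seek2 comments (why one shared StreamReader gives wrong results): as far as I can tell, it comes down to StreamReader reading ahead into its own buffer. After the first ReadLine, the FileStream position is usually well past the end of that line, so a seek relative to SeekOrigin.Current starts from the wrong place, and the next ReadLine can be served from stale buffered data. That would also explain why line 4 came back instead of line 5 in the original output. Below is a minimal sketch of a shared-reader version that seeks from the beginning and discards the buffer each time. The class and method names (BufferedSeekDemo, WriteSample, and so on) are made up for the illustration, and ASCII data is assumed.

```csharp
using System;
using System.IO;
using System.Text;

static class BufferedSeekDemo
{
    // Hypothetical helper: writes a padded sample file like the one in the
    // question (13 characters plus "\r\n" per line) and returns its path.
    public static string WriteSample(int lineCount)
    {
        string path = Path.GetTempFileName();
        var sb = new StringBuilder();
        for (int n = 1; n <= lineCount; n++)
        {
            sb.Append(n.ToString("D4")).Append("|100!2500\r\n");
        }
        File.WriteAllText(path, sb.ToString(), Encoding.ASCII);
        return path;
    }

    // Shows the read-ahead: returns the FileStream position after the
    // reader has returned only the first line.
    public static long PositionAfterFirstLine(string path)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var reader = new StreamReader(fs, Encoding.ASCII))
        {
            reader.ReadLine();
            return fs.Position;
        }
    }

    // Reads the given 1-based lines through ONE shared StreamReader,
    // discarding the reader's buffer after every absolute seek.
    public static string[] ReadLines(string path, int lineLength, int[] lineNos)
    {
        var result = new string[lineNos.Length];
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var reader = new StreamReader(fs, Encoding.ASCII))
        {
            for (int i = 0; i < lineNos.Length; i++)
            {
                fs.Seek((long)(lineNos[i] - 1) * lineLength, SeekOrigin.Begin);
                reader.DiscardBufferedData(); // forget bytes buffered before the seek
                result[i] = reader.ReadLine();
            }
        }
        return result;
    }
}
```

Seek2 avoids the problem in a different way: it creates a fresh StreamReader, with an empty buffer, for every line.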
Accepted answer
I'm confused by your expected positions: line 5 at positions 30 and 45, line 4 at 15, and line 3 at 30?
Here is the core of the read logic:
var offset = (lineNumber - 1) * LineLength;
fs.Seek(offset, SeekOrigin.Begin);
var data = new byte[LineLength];
fs.Read(data, 0, LineLength);
var text = DecodeData(data);
Debug.Print("{0,-12}{1,-16}{2}", lineNumber, offset, text);
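The offset arithmetic in the snippet above can be checked in isolation. Here is a small sketch; LineOffsets and OffsetOf are names invented for the illustration. With the 15-byte lines of the sample file, lines 3, 4 and 5 start at absolute offsets 30, 45 and 60.

```csharp
using System;

static class LineOffsets
{
    // Absolute byte offset of a 1-based line number in a file whose lines
    // all occupy lineLength bytes (content plus the line terminator).
    // The cast to long is a precaution: int arithmetic would overflow
    // once the offset passes 2 GB.
    public static long OffsetOf(int lineNo, int lineLength)
    {
        return (long)(lineNo - 1) * lineLength;
    }
}
```

Because each offset is measured from the start of the file (SeekOrigin.Begin), it does not depend on where the previous read left the stream.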
The full example is here:
class PaddedFileSearch
{
    public int LineLength { get; private set; }
    public FileInfo File { get; private set; }

    public PaddedFileSearch(FileInfo fileInfo)
    {
        var length = FindLineLength(fileInfo);
        //if (length == 0) throw new PaddedProgramException();
        LineLength = length;
        File = fileInfo;
    }

    private static int FindLineLength(FileInfo fileInfo)
    {
        using (var reader = new StreamReader(fileInfo.FullName))
        {
            string line;
            if ((line = reader.ReadLine()) != null)
            {
                var length = line.Length + 2;
                return length;
            }
        }
        return 0;
    }

    public void SeekMethod(List<int> lineNumbers)
    {
        Debug.Print("");
        Debug.Print("Line No\t\tPosition\t\tLine");
        Debug.Print("-------\t\t--------\t\t-----------------");
        lineNumbers.Sort();
        using (var fs = new FileStream(File.FullName, FileMode.Open))
        {
            lineNumbers.ForEach(ln => OutputData(fs, ln));
        }
    }

    private void OutputData(FileStream fs, int lineNumber)
    {
        var offset = (lineNumber - 1) * LineLength;
        fs.Seek(offset, SeekOrigin.Begin);
        var data = new byte[LineLength];
        fs.Read(data, 0, LineLength);
        var text = DecodeData(data);
        Debug.Print("{0,-12}{1,-16}{2}", lineNumber, offset, text);
    }

    private static string DecodeData(byte[] data)
    {
        var encoding = new UTF8Encoding();
        return encoding.GetString(data);
    }
}

class Program
{
    static void Main(string[] args)
    {
        var seeker = new PaddedFileSearch(new FileInfo(@"D:\Desktop\Test.txt"));
        Debug.Print("File Line length: {0}", seeker.LineLength);
        seeker.SeekMethod(new List<int> { 5, 3, 4 });
        seeker.SeekMethod(new List<int> { 5, 3 });
    }
}
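Two details of OutputData may be worth tightening in real use: Stream.Read is allowed to return fewer bytes than requested, and the decoded text still carries the trailing "\r\n". Here is a minimal sketch of a variant that loops until the line is complete and trims the terminator. FixedLineReader is a hypothetical name, and the code assumes single-byte (ASCII) data like the sample file, so that one character is one byte.

```csharp
using System;
using System.IO;
using System.Text;

static class FixedLineReader
{
    // Returns the content of the 1-based line without its "\r\n",
    // or null if the line lies beyond the end of the stream.
    public static string ReadLine(Stream stream, int lineNo, int lineLength)
    {
        stream.Seek((long)(lineNo - 1) * lineLength, SeekOrigin.Begin);
        var buffer = new byte[lineLength];
        int total = 0;
        // Read may deliver the line in several pieces, so loop until the
        // buffer is full or the stream ends.
        while (total < lineLength)
        {
            int read = stream.Read(buffer, total, lineLength - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        if (total == 0) return null;
        return Encoding.ASCII.GetString(buffer, 0, total).TrimEnd('\r', '\n');
    }
}
```

It takes a Stream rather than a FileStream so that it can be tried against a MemoryStream; with a file, pass the open FileStream exactly as OutputData does.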
Regarding c# - using FileStream.Seek: the original question is on Stack Overflow: https://stackoverflow.com/questions/5201414/