
c# - Why does C#'s MemoryStream reserve so much memory?

Reposted. Author: IT王子. Updated: 2023-10-28 23:28:55

Our software decompresses certain byte data through a GZipStream, which reads the data from a MemoryStream. The data is decompressed in 4 KB blocks and written to another MemoryStream.

We have noticed that the memory allocated by the process is far higher than the actual decompressed data.

Example: a compressed byte array of 2,425,536 bytes is decompressed to 23,050,718 bytes. The memory profiler we use shows that the method MemoryStream.set_Capacity(Int32 value) allocated 67,104,936 bytes. That is a factor of 2.9 between reserved and actually written memory.

Note: MemoryStream.set_Capacity is called from MemoryStream.EnsureCapacity, which in turn is called from MemoryStream.Write in our function.

Why does MemoryStream reserve so much capacity, even though it only appends blocks of 4 KB?

Here is the code snippet that decompresses the data:

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream())
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}

Note: in case it is relevant, this is the system configuration:

  • Windows XP 32-bit,
  • .NET 3.5
  • Compiled with Visual Studio 2008

Best Answer

Because this is the algorithm for how it expands its capacity:

public override void Write(byte[] buffer, int offset, int count) {

    //... Removed error checking for brevity

    int i = _position + count;
    // Check for overflow
    if (i < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));

    if (i > _length) {
        bool mustZero = _position > _length;
        if (i > _capacity) {
            bool allocatedNewArray = EnsureCapacity(i);
            if (allocatedNewArray)
                mustZero = false;
        }
        if (mustZero)
            Array.Clear(_buffer, _length, i - _length);
        _length = i;
    }

    //...
}

private bool EnsureCapacity(int value) {
    // Check for overflow
    if (value < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
    if (value > _capacity) {
        int newCapacity = value;
        if (newCapacity < 256)
            newCapacity = 256;
        if (newCapacity < _capacity * 2)
            newCapacity = _capacity * 2;
        Capacity = newCapacity;
        return true;
    }
    return false;
}

public virtual int Capacity
{
    //...

    set {
        //...

        // MemoryStream has this invariant: _origin > 0 => !expandable (see ctors)
        if (_expandable && value != _capacity) {
            if (value > 0) {
                byte[] newBuffer = new byte[value];
                if (_length > 0) Buffer.InternalBlockCopy(_buffer, 0, newBuffer, 0, _length);
                _buffer = newBuffer;
            }
            else {
                _buffer = null;
            }
            _capacity = value;
        }
    }
}

So every time the capacity limit is hit, the capacity is doubled. The reason for this is that the Buffer.InternalBlockCopy operation is slow for large arrays, so if it had to resize frequently on every Write call, performance would drop significantly.
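The doubling policy also explains the 67 MB number the profiler reports: each resize allocates a brand-new array, and the profiler attributes every one of those allocations to set_Capacity. Growing from 256 bytes by doubling until the capacity covers the 23,050,718 decompressed bytes from the question allocates 256 + 512 + … + 33,554,432 ≈ 67 MB in total, even though the final buffer is only 32 MB. A minimal sketch of that arithmetic (the byte counts are taken from the question above):

```csharp
using System;

class CapacityGrowthDemo
{
    // Simulate MemoryStream.EnsureCapacity's doubling policy and sum
    // every intermediate buffer allocation, mirroring what a memory
    // profiler attributes to MemoryStream.set_Capacity.
    static void Main()
    {
        long finalLength = 23_050_718; // decompressed size from the question
        long capacity = 0;
        long totalAllocated = 0;

        while (capacity < finalLength)
        {
            long next = Math.Max(256, capacity * 2);
            totalAllocated += next;   // each resize allocates a new array
            capacity = next;
        }

        Console.WriteLine($"Final capacity:  {capacity:N0}");        // 33,554,432
        Console.WriteLine($"Total allocated: {totalAllocated:N0}");  // 67,108,608
    }
}
```

The simulated total, 67,108,608 bytes, is within a few kilobytes of the 67,104,936 bytes the profiler measured, so the "2.9× overhead" is simply the sum of all the doubling steps, not a single oversized reservation.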

There are a couple of things you can do to improve performance: set the initial capacity to at least the size of the compressed array, and grow the capacity by a factor smaller than 2.0 to reduce the amount of memory you are using.

const double ResizeFactor = 1.25;

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream((int)(data.Length * ResizeFactor))) // Initial size = compressed size + 25%.
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            if (resultStream.Capacity < resultStream.Length + iCount)
                resultStream.Capacity = (int)(resultStream.Capacity * ResizeFactor); // Resize to 125% instead of 200%.

            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}

If you want, you can use an even fancier algorithm, such as resizing based on the current compression ratio:

const double MinResizeFactor = 1.05;

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream((int)(data.Length * MinResizeFactor))) // Initial size = compressed size + the minimum resize factor.
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            if (resultStream.Capacity < resultStream.Length + iCount)
            {
                // The +1 prevents divide-by-zero errors; it may not be necessary in practice.
                double sizeRatio = ((double)resultStream.Position + iCount) / (compressedStream.Position + 1);

                // Resize to the current capacity times the minimum resize factor, or the
                // compressed stream length times the compression ratio + the minimum resize
                // factor, whichever is larger.
                resultStream.Capacity = (int)Math.Max(resultStream.Capacity * MinResizeFactor,
                                                      (sizeRatio + (MinResizeFactor - 1)) * compressedStream.Length);
            }

            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}
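There is also a way to avoid resizing altogether. The gzip format (RFC 1952) stores the original uncompressed length modulo 2^32 in the last four bytes of the stream, little-endian (the ISIZE field). When the input is a complete, single-member gzip buffer smaller than 2 GB, you can read that trailer and size the result stream exactly. This is a sketch under those assumptions, not something the answer above proposed; the helper names are illustrative:

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzipSizeHelper
{
    // Reads the ISIZE trailer (RFC 1952): the uncompressed length modulo
    // 2^32, stored little-endian in the last 4 bytes of a gzip stream.
    // Valid for complete single-member buffers under 2 GB.
    public static int GetUncompressedSize(byte[] data)
    {
        return BitConverter.ToInt32(data, data.Length - 4);
    }

    public static byte[] Decompress(byte[] data)
    {
        int size = GetUncompressedSize(data);
        using (MemoryStream compressedStream = new MemoryStream(data))
        using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        using (MemoryStream resultStream = new MemoryStream(size)) // exact capacity: no resizing, no copying
        {
            byte[] buffer = new byte[4096];
            int count;
            while ((count = zipStream.Read(buffer, 0, buffer.Length)) > 0)
                resultStream.Write(buffer, 0, count);
            return resultStream.ToArray();
        }
    }
}
```

With the exact size known up front, set_Capacity is called once by the constructor and the total allocation equals the decompressed size, instead of roughly 2.9 times it.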

Regarding "c# - Why does C#'s MemoryStream reserve so much memory?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24636259/
