gpt4 book ai didi

c# - 为什么运算符比方法调用慢得多? (结构仅在较旧的 JIT 上较慢)

转载 作者:IT王子 更新时间:2023-10-29 03:38:27 30 4
gpt4 key购买 nike

简介:我用 C# 编写高性能代码。是的,我知道 C++ 会给我更好的优化,但我仍然选择使用 C#。我不想辩论那个选择。相反,我想听听像我一样尝试在 .NET Framework 上编写高性能代码的人的意见。

问题:

  • 为什么下面代码中的运算符比等价的运算符慢方法调用??
  • 为什么方法在下面的代码中传递了两个 double 值比传递具有两个结构的等效方法更快 double 里面? (A:较旧的 JIT 优化结构很差)
  • 有没有办法让 .NET JIT 编译器处理简单结构与结构成员一样高效? (A:获得更新的 JIT)

我认为我知道的:最初的 .NET JIT 编译器不会内联任何涉及结构的内容。 Bizarre given structs 只应该用在你需要像内置函数一样优化的小值类型的地方,但这是真的。幸运的是,在 .NET 3.5SP1 和 .NET 2.0SP2 中,他们对 JIT 优化器进行了一些改进,包括对内联的改进,特别是对于结构。 (我猜他们这样做是因为否则他们引入的新 Complex 结构会执行得非常糟糕......所以 Complex 团队可能会抨击 JIT Optimizer 团队。)因此,.NET 3.5 SP1 之前的任何文档都可能是与这个问题不太相关。

我的测试显示:我已经通过检查 C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll 文件确实有版本 >= 3053 来验证我确实有更新的 JIT 优化器,因此应该对 JIT 优化器进行这些改进。然而,即便如此,我的时间安排和对反汇编的看法都是:

JIT 生成的用于传递具有两个 double 的结构的代码的效率远低于直接传递两个 double 的代码。

与将结构作为参数传递相比,JIT 生成的结构方法代码传入“this”的效率要高得多。

如果您传递两个 double 而不是传递一个具有两个 double 的结构,那么 JIT 仍然可以更好地内联,即使由于明显处于循环中而使用乘数也是如此。

时机:实际上,通过反汇编,我意识到循环中的大部分时间只是访问 List 之外的测试数据。如果将循环的开销代码和数据访问因素排除在外,进行相同调用的四种方式之间的区别就会大不相同。通过执行 PlusEqual(double, double) 而不是 PlusEqual(Element),我获得了 5 到 20 倍的加速。执行 PlusEqual(double, double) 而不是运算符 += 是 10 到 40 倍。哇。伤心。

这是一组时间安排:

Populating List<Element> took 320ms.
The PlusEqual() method took 105ms.
The 'same' += operator took 131ms.
The 'same' -= operator took 139ms.
The PlusEqual(double, double) method took 68ms.
The do nothing loop took 66ms.
The ratio of operator with constructor to method is 124%.
The ratio of operator without constructor to method is 132%.
The ratio of PlusEqual(double,double) to PlusEqual(Element) is 64%.
If we remove the overhead time for the loop accessing the elements from the List...
The ratio of operator with constructor to method is 166%.
The ratio of operator without constructor to method is 187%.
The ratio of PlusEqual(double,double) to PlusEqual(Element) is 5%.

代码:

namespace OperatorVsMethod
{
public struct Element
{
public double Left;
public double Right;

public Element(double left, double right)
{
this.Left = left;
this.Right = right;
}

public static Element operator +(Element x, Element y)
{
return new Element(x.Left + y.Left, x.Right + y.Right);
}

public static Element operator -(Element x, Element y)
{
x.Left += y.Left;
x.Right += y.Right;
return x;
}

/// <summary>
/// Like the += operator; but faster.
/// </summary>
public void PlusEqual(Element that)
{
this.Left += that.Left;
this.Right += that.Right;
}

/// <summary>
/// Like the += operator; but faster.
/// </summary>
public void PlusEqual(double thatLeft, double thatRight)
{
this.Left += thatLeft;
this.Right += thatRight;
}
}

[TestClass]
public class UnitTest1
{
[TestMethod]
public void TestMethod1()
{
Stopwatch stopwatch = new Stopwatch();

// Populate a List of Elements to multiply together
int seedSize = 4;
List<double> doubles = new List<double>(seedSize);
doubles.Add(2.5d);
doubles.Add(100000d);
doubles.Add(-0.5d);
doubles.Add(-100002d);

int size = 2500000 * seedSize;
List<Element> elts = new List<Element>(size);

stopwatch.Reset();
stopwatch.Start();
for (int ii = 0; ii < size; ++ii)
{
int di = ii % seedSize;
double d = doubles[di];
elts.Add(new Element(d, d));
}
stopwatch.Stop();
long populateMS = stopwatch.ElapsedMilliseconds;

// Measure speed of += operator (calls ctor)
Element operatorCtorResult = new Element(1d, 1d);
stopwatch.Reset();
stopwatch.Start();
for (int ii = 0; ii < size; ++ii)
{
operatorCtorResult += elts[ii];
}
stopwatch.Stop();
long operatorCtorMS = stopwatch.ElapsedMilliseconds;

// Measure speed of -= operator (+= without ctor)
Element operatorNoCtorResult = new Element(1d, 1d);
stopwatch.Reset();
stopwatch.Start();
for (int ii = 0; ii < size; ++ii)
{
operatorNoCtorResult -= elts[ii];
}
stopwatch.Stop();
long operatorNoCtorMS = stopwatch.ElapsedMilliseconds;

// Measure speed of PlusEqual(Element) method
Element plusEqualResult = new Element(1d, 1d);
stopwatch.Reset();
stopwatch.Start();
for (int ii = 0; ii < size; ++ii)
{
plusEqualResult.PlusEqual(elts[ii]);
}
stopwatch.Stop();
long plusEqualMS = stopwatch.ElapsedMilliseconds;

// Measure speed of PlusEqual(double, double) method
Element plusEqualDDResult = new Element(1d, 1d);
stopwatch.Reset();
stopwatch.Start();
for (int ii = 0; ii < size; ++ii)
{
Element elt = elts[ii];
plusEqualDDResult.PlusEqual(elt.Left, elt.Right);
}
stopwatch.Stop();
long plusEqualDDMS = stopwatch.ElapsedMilliseconds;

// Measure speed of doing nothing but accessing the Element
Element doNothingResult = new Element(1d, 1d);
stopwatch.Reset();
stopwatch.Start();
for (int ii = 0; ii < size; ++ii)
{
Element elt = elts[ii];
double left = elt.Left;
double right = elt.Right;
}
stopwatch.Stop();
long doNothingMS = stopwatch.ElapsedMilliseconds;

// Report results
Assert.AreEqual(1d, operatorCtorResult.Left, "The operator += did not compute the right result!");
Assert.AreEqual(1d, operatorNoCtorResult.Left, "The operator += did not compute the right result!");
Assert.AreEqual(1d, plusEqualResult.Left, "The operator += did not compute the right result!");
Assert.AreEqual(1d, plusEqualDDResult.Left, "The operator += did not compute the right result!");
Assert.AreEqual(1d, doNothingResult.Left, "The operator += did not compute the right result!");

// Report speeds
Console.WriteLine("Populating List<Element> took {0}ms.", populateMS);
Console.WriteLine("The PlusEqual() method took {0}ms.", plusEqualMS);
Console.WriteLine("The 'same' += operator took {0}ms.", operatorCtorMS);
Console.WriteLine("The 'same' -= operator took {0}ms.", operatorNoCtorMS);
Console.WriteLine("The PlusEqual(double, double) method took {0}ms.", plusEqualDDMS);
Console.WriteLine("The do nothing loop took {0}ms.", doNothingMS);

// Compare speeds
long percentageRatio = 100L * operatorCtorMS / plusEqualMS;
Console.WriteLine("The ratio of operator with constructor to method is {0}%.", percentageRatio);
percentageRatio = 100L * operatorNoCtorMS / plusEqualMS;
Console.WriteLine("The ratio of operator without constructor to method is {0}%.", percentageRatio);
percentageRatio = 100L * plusEqualDDMS / plusEqualMS;
Console.WriteLine("The ratio of PlusEqual(double,double) to PlusEqual(Element) is {0}%.", percentageRatio);

operatorCtorMS -= doNothingMS;
operatorNoCtorMS -= doNothingMS;
plusEqualMS -= doNothingMS;
plusEqualDDMS -= doNothingMS;
Console.WriteLine("If we remove the overhead time for the loop accessing the elements from the List...");
percentageRatio = 100L * operatorCtorMS / plusEqualMS;
Console.WriteLine("The ratio of operator with constructor to method is {0}%.", percentageRatio);
percentageRatio = 100L * operatorNoCtorMS / plusEqualMS;
Console.WriteLine("The ratio of operator without constructor to method is {0}%.", percentageRatio);
percentageRatio = 100L * plusEqualDDMS / plusEqualMS;
Console.WriteLine("The ratio of PlusEqual(double,double) to PlusEqual(Element) is {0}%.", percentageRatio);
}
}
}

IL:(也就是上面的一些被编译成的)

public void PlusEqual(Element that)
{
00000000 push ebp
00000001 mov ebp,esp
00000003 push edi
00000004 push esi
00000005 push ebx
00000006 sub esp,30h
00000009 xor eax,eax
0000000b mov dword ptr [ebp-10h],eax
0000000e xor eax,eax
00000010 mov dword ptr [ebp-1Ch],eax
00000013 mov dword ptr [ebp-3Ch],ecx
00000016 cmp dword ptr ds:[04C87B7Ch],0
0000001d je 00000024
0000001f call 753081B1
00000024 nop
this.Left += that.Left;
00000025 mov eax,dword ptr [ebp-3Ch]
00000028 fld qword ptr [ebp+8]
0000002b fadd qword ptr [eax]
0000002d fstp qword ptr [eax]
this.Right += that.Right;
0000002f mov eax,dword ptr [ebp-3Ch]
00000032 fld qword ptr [ebp+10h]
00000035 fadd qword ptr [eax+8]
00000038 fstp qword ptr [eax+8]
}
0000003b nop
0000003c lea esp,[ebp-0Ch]
0000003f pop ebx
00000040 pop esi
00000041 pop edi
00000042 pop ebp
00000043 ret 10h
public void PlusEqual(double thatLeft, double thatRight)
{
00000000 push ebp
00000001 mov ebp,esp
00000003 push edi
00000004 push esi
00000005 push ebx
00000006 sub esp,30h
00000009 xor eax,eax
0000000b mov dword ptr [ebp-10h],eax
0000000e xor eax,eax
00000010 mov dword ptr [ebp-1Ch],eax
00000013 mov dword ptr [ebp-3Ch],ecx
00000016 cmp dword ptr ds:[04C87B7Ch],0
0000001d je 00000024
0000001f call 75308159
00000024 nop
this.Left += thatLeft;
00000025 mov eax,dword ptr [ebp-3Ch]
00000028 fld qword ptr [ebp+10h]
0000002b fadd qword ptr [eax]
0000002d fstp qword ptr [eax]
this.Right += thatRight;
0000002f mov eax,dword ptr [ebp-3Ch]
00000032 fld qword ptr [ebp+8]
00000035 fadd qword ptr [eax+8]
00000038 fstp qword ptr [eax+8]
}
0000003b nop
0000003c lea esp,[ebp-0Ch]
0000003f pop ebx
00000040 pop esi
00000041 pop edi
00000042 pop ebp
00000043 ret 10h

最佳答案

我得到了非常不同的结果,更不那么引人注目了。但没有使用测试运行器,我将代码粘贴到控制台模式应用程序中。当我尝试时,5% 的结果在 32 位模式下约为 87%,在 64 位模式下约为 100%。

对齐对于 double 至关重要,.NET 运行时只能 promise 在 32 位机器上对齐 4。在我看来,测试运行程序正在使用与 4 而不是 8 对齐的堆栈地址开始测试方法。当 double 跨越高速缓存行边界时,未对齐惩罚变得非常大。

关于c# - 为什么运算符比方法调用慢得多? (结构仅在较旧的 JIT 上较慢),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/7615959/

30 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com