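I have been having trouble understanding the performance characteristics of using Func<...> throughout my code in combination with inheritance and generics, which is a combination I find myself using all the time.
Let me start with a minimal test case so we all know what we are talking about. Then I will post the results, and then explain what I expected and why.
Minimal test case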
using System;

public class GenericsTest2 : GenericsTest<int>
{
    static void Main(string[] args)
    {
        GenericsTest2 at = new GenericsTest2();
        at.test(at.func);                        // test 1: lambda stored in the base-class field
        at.test(at.Check);                       // test 2: generic base-class method
        at.test(at.func2);                       // test 3: lambda stored in the derived-class field
        at.test(at.Check2);                      // test 4: non-generic method
        at.test((a) => a.Equals(default(int)));  // test 5: inline lambda
        Console.ReadLine();
    }

    public GenericsTest2()
    {
        // Both fields hold the same lambda, which wraps the generic Check.
        func = func2 = (a) => Check(a);
    }

    protected Func<int, bool> func2;

    public bool Check2(int value)
    {
        return value.Equals(default(int));
    }

    public void test(Func<int, bool> func)
    {
        // The custom Stopwatch below prints the elapsed time when it is disposed.
        using (Stopwatch sw = new Stopwatch((ts) => { Console.WriteLine("Took {0:0.00}s", ts.TotalSeconds); }))
        {
            for (int i = 0; i < 100000000; ++i)
            {
                func(i);
            }
        }
    }
}

public class GenericsTest<T>
{
    public bool Check(T value)
    {
        return value.Equals(default(T));
    }

    protected Func<T, bool> func;
}

// Minimal timing helper (not System.Diagnostics.Stopwatch): reports the
// wall-clock time between construction and Dispose.
public class Stopwatch : IDisposable
{
    public Stopwatch(Action<TimeSpan> act)
    {
        this.act = act;
        this.start = DateTime.UtcNow;
    }

    private Action<TimeSpan> act;
    private DateTime start;

    public void Dispose()
    {
        act(DateTime.UtcNow.Subtract(start));
    }
}
Results
Took 2.50s -> at.test(at.func);
Took 1.97s -> at.test(at.Check);
Took 2.48s -> at.test(at.func2);
Took 0.72s -> at.test(at.Check2);
Took 0.81s -> at.test((a) => a.Equals(default(int)));
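My expectations and why
I would expect this code to run at exactly the same speed for all 5 methods. More precisely, I would expect it to be even faster than any of them, namely just as fast as: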
using (Stopwatch sw = new Stopwatch((ts) => { Console.WriteLine("Took {0:0.00}s", ts.TotalSeconds); }))
{
    for (int i = 0; i < 100000000; ++i)
    {
        bool b = i.Equals(default(int));
    }
}
// this takes 0.32s ?!?
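I expected it to take 0.32s because I do not see any reason for the JIT compiler not to inline the code in this particular case.
On closer inspection, I do not understand these performance numbers at all:
- at.func is passed to the function and cannot change during execution. Why is it not inlined?
- at.Check is clearly slower than at.Check2 (1.97s vs. 0.72s), even though neither can be overridden and the IL of at.Check is completely fixed in the case of class GenericsTest2.
- Why is a Func<int, bool> slower when it is passed as an inline lambda than when a method is converted to a Func?
Question
I would really like to understand this. What is going on here that makes using a generic base class as much as 10x slower than inlining the whole thing?
So, basically, the question is: why is this happening, and how can I fix it?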
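Update
Based on all the comments so far (thanks!) I did some more digging.
First, here is a new set of results, obtained by repeating the tests with the loop made 5x larger and executing everything 4 times. I used the System.Diagnostics Stopwatch and added more tests (with descriptions as well).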
(Baseline implementation took 2.61s)
--- Run 0 ---
Took 3.00s for (a) => at.Check2(a)
Took 12.04s for Check3<int>
Took 12.51s for (a) => GenericsTest2.Check(a)
Took 13.74s for at.func
Took 16.07s for GenericsTest2.Check
Took 12.99s for at.func2
Took 1.47s for at.Check2
Took 2.31s for (a) => a.Equals(default(int))
--- Run 1 ---
Took 3.18s for (a) => at.Check2(a)
Took 13.29s for Check3<int>
Took 14.10s for (a) => GenericsTest2.Check(a)
Took 13.54s for at.func
Took 13.48s for GenericsTest2.Check
Took 13.89s for at.func2
Took 1.94s for at.Check2
Took 2.61s for (a) => a.Equals(default(int))
--- Run 2 ---
Took 3.18s for (a) => at.Check2(a)
Took 12.91s for Check3<int>
Took 15.20s for (a) => GenericsTest2.Check(a)
Took 12.90s for at.func
Took 13.79s for GenericsTest2.Check
Took 14.52s for at.func2
Took 2.02s for at.Check2
Took 2.67s for (a) => a.Equals(default(int))
--- Run 3 ---
Took 3.17s for (a) => at.Check2(a)
Took 12.69s for Check3<int>
Took 13.58s for (a) => GenericsTest2.Check(a)
Took 14.27s for at.func
Took 12.82s for GenericsTest2.Check
Took 14.03s for at.func2
Took 1.32s for at.Check2
Took 1.70s for (a) => a.Equals(default(int))
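What I noticed from these results is that things get much slower the moment you start using generics. Digging a bit deeper, this is the IL I found for the non-generic implementation: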
L_0000: ldarga.s 'value'
L_0002: ldc.i4.0
L_0003: call instance bool [mscorlib]System.Int32::Equals(int32)
L_0008: ret
And for all the generic implementations:
L_0000: ldarga.s 'value'
L_0002: ldloca.s CS$0$0000
L_0004: initobj !T
L_000a: ldloc.0
L_000b: box !T
L_0010: constrained. !T
L_0016: callvirt instance bool [mscorlib]System.Object::Equals(object)
L_001b: ret
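Most of this can be optimized away, but I think the callvirt might be the problem here.
To make it faster, I added a "T : IEquatable<T>" constraint to the definition of the method. The result is:
L_0011: callvirt instance bool [mscorlib]System.IEquatable`1<!T>::Equals(!0)
While I now understand more about the performance (it probably cannot be inlined because it creates a vtable lookup), I am still confused: why does it not simply call T::Equals? After all, I did specify that it will be there...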
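To make the three call shapes discussed above concrete, here is a minimal sketch. The class and method names (EqualityChecks, CheckUnconstrained, CheckEquatable, CheckComparer) are made up for illustration and do not appear in the original post. The third variant, EqualityComparer<T>.Default, is an alternative not mentioned in the question. The C# compiler emits one generic IL body for every T, so it cannot name Int32::Equals directly. As far as I understand it, the "constrained." prefix is what lets the JIT bind the call to the value type's own method when it specializes the code for int.

```csharp
using System;
using System.Collections.Generic;

public static class EqualityChecks
{
    // Unconstrained T: the only Equals the compiler can bind to is
    // Object.Equals(object), so default(T) is boxed on every call.
    public static bool CheckUnconstrained<T>(T value)
    {
        return value.Equals(default(T));
    }

    // With the constraint, overload resolution prefers IEquatable<T>.Equals(T),
    // so the argument is passed as a T and is not boxed.
    public static bool CheckEquatable<T>(T value) where T : IEquatable<T>
    {
        return value.Equals(default(T));
    }

    // EqualityComparer<T>.Default needs no constraint and uses IEquatable<T>
    // when T implements it.
    public static bool CheckComparer<T>(T value)
    {
        return EqualityComparer<T>.Default.Equals(value, default(T));
    }
}

public static class EqualityChecksDemo
{
    public static void Main()
    {
        Console.WriteLine(EqualityChecks.CheckUnconstrained(0)); // True
        Console.WriteLine(EqualityChecks.CheckEquatable(0));     // True
        Console.WriteLine(EqualityChecks.CheckComparer(5));      // False
    }
}
```

All three return the same answers. Whether any of them matches the speed of the non-generic version depends on the runtime and JIT version, so it is worth measuring rather than assuming.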
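Best answer
Always run a micro-benchmark 3 times. The first run triggers the JIT, which takes JIT cost out of the later runs. Then check that the 2nd and 3rd runs agree. This gives: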
... run ...
Took 0.79s
Took 0.63s
Took 0.74s
Took 0.24s
Took 0.32s
... run ...
Took 0.73s
Took 0.63s
Took 0.73s
Took 0.24s
Took 0.33s
... run ...
Took 0.74s
Took 0.63s
Took 0.74s
Took 0.25s
Took 0.33s
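The "run it three times" advice can be sketched as a small harness. The names MicroBench and Measure are hypothetical, and System.Diagnostics.Stopwatch replaces the DateTime-based helper from the question. The idea is simply to time the same delegate several times in one process and treat the first run as warm-up.

```csharp
using System;
using System.Diagnostics;

public static class MicroBench
{
    // Times `runs` consecutive passes of `iterations` delegate calls each.
    // The first entry includes JIT compilation and should be discarded.
    public static TimeSpan[] Measure(Func<int, bool> func, int iterations, int runs)
    {
        TimeSpan[] elapsed = new TimeSpan[runs];
        for (int run = 0; run < runs; ++run)
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; ++i)
            {
                func(i);
            }
            sw.Stop();
            elapsed[run] = sw.Elapsed;
        }
        return elapsed;
    }

    public static void Main()
    {
        TimeSpan[] times = Measure(delegate(int a) { return a.Equals(default(int)); }, 100000000, 3);
        for (int run = 0; run < times.Length; ++run)
        {
            Console.WriteLine("Run {0}: {1:0.00}s{2}", run + 1, times[run].TotalSeconds,
                run == 0 ? " (warm-up, includes JIT)" : "");
        }
    }
}
```

If the 2nd and 3rd runs still disagree noticeably, something other than JIT warm-up (for example other load on the machine) is probably affecting the numbers.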
The line
func = func2 = (a) => Check(a);
adds an additional function call. Removing it with
func = func2 = this.Check;
gives:
... 1. run ...
Took 0.64s
Took 0.63s
Took 0.63s
Took 0.24s
Took 0.32s
... 2. run ...
Took 0.63s
Took 0.63s
Took 0.63s
Took 0.24s
Took 0.32s
... 3. run ...
Took 0.63s
Took 0.63s
Took 0.63s
Took 0.24s
Took 0.32s
This suggests that the (JIT?) effect between the 1st and 2nd run has disappeared because the function call was removed. The first 3 tests are now equal.
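The extra call can be made visible by looking at what each delegate actually points to. This is a minimal sketch with a made-up class name (DelegateShapes): the method-group conversion targets Check itself, while the lambda targets a compiler-generated method that in turn calls Check.

```csharp
using System;

public class DelegateShapes
{
    public bool Check(int value)
    {
        return value.Equals(default(int));
    }

    // Lambda wrapper: the delegate targets a compiler-generated method
    // whose body calls Check, i.e. one extra call per invocation.
    public Func<int, bool> Wrapped()
    {
        return (a) => Check(a);
    }

    // Method-group conversion: the delegate targets Check directly.
    public Func<int, bool> Direct()
    {
        return this.Check;
    }

    public static void Main()
    {
        DelegateShapes d = new DelegateShapes();
        Console.WriteLine(d.Direct().Method.Name);             // Check
        Console.WriteLine(d.Wrapped().Method.Name == "Check"); // False
    }
}
```

The name of the generated method is a compiler implementation detail, which is why the sketch only checks that it is not "Check".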
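In tests 4 and 5, the compiler can inline the function argument into void test(Func<>), whereas in tests 1 to 3 it would take the compiler a long way to figure out that they are constant. Sometimes the compiler is subject to constraints that are not easy to see from our perspective as coders, such as the .NET and JIT constraints that come from the dynamic nature of .NET programs, compared with a binary produced from C++. In any case, it is the inlining of the function argument that makes the difference here.
The difference between 4 and 5? Well, test 5 looks like something the compiler could inline just as easily. Maybe it builds a context for the closure and resolves it in a more complex way than necessary. I did not dig into the MSIL to find out.
The tests above were run with .NET 4.5. Here are the results with 3.5, which show that the compiler has become better at inlining: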
... 1. run ...
Took 1.06s
Took 1.06s
Took 1.06s
Took 0.24s
Took 0.27s
... 2. run ...
Took 1.06s
Took 1.08s
Took 1.06s
Took 0.25s
Took 0.27s
... 3. run ...
Took 1.05s
Took 1.06s
Took 1.05s
Took 0.24s
Took 0.27s
And .NET 4:
... 1. run ...
Took 0.97s
Took 0.97s
Took 0.96s
Took 0.22s
Took 0.30s
... 2. run ...
Took 0.96s
Took 0.96s
Took 0.96s
Took 0.22s
Took 0.30s
... 3. run ...
Took 0.97s
Took 0.96s
Took 0.96s
Took 0.22s
Took 0.30s
Now change GenericsTest<T> to a non-generic GenericsTest:
... 1. run ...
Took 0.28s
Took 0.24s
Took 0.24s
Took 0.24s
Took 0.27s
... 2. run ...
Took 0.24s
Took 0.24s
Took 0.24s
Took 0.24s
Took 0.27s
... 3. run ...
Took 0.25s
Took 0.25s
Took 0.25s
Took 0.24s
Took 0.27s
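Well, this is a surprise from the C# compiler, similar to what I ran into when sealing classes to avoid virtual function calls. Maybe Eric Lippert has something to say about this?
Removing the inheritance in favor of aggregation restores the performance. I have learned to never use inheritance (well, only very rarely) and can strongly recommend that you avoid it, at least in this case. (This is my pragmatic solution to this problem; no flame war intended.) I use interfaces rigorously throughout, and they carry no performance penalty.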
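The answer does not show its aggregation-based code, so the following is only a guess at the general shape, with made-up names (IChecker<T>, IntChecker, CompositionTest). Instead of deriving from a generic base class, the test class holds a non-generic checker behind an interface and builds its delegate from that.

```csharp
using System;

// Hypothetical interface; the answer only says that it uses interfaces.
public interface IChecker<T>
{
    bool Check(T value);
}

// Non-generic implementation: Equals binds to Int32.Equals(int).
public sealed class IntChecker : IChecker<int>
{
    public bool Check(int value)
    {
        return value.Equals(default(int));
    }
}

// Holds a checker (aggregation) instead of inheriting from GenericsTest<int>.
public sealed class CompositionTest
{
    private readonly IChecker<int> checker;

    public CompositionTest(IChecker<int> checker)
    {
        this.checker = checker;
    }

    public Func<int, bool> AsFunc()
    {
        return checker.Check;
    }

    public static void Main()
    {
        CompositionTest test = new CompositionTest(new IntChecker());
        Func<int, bool> func = test.AsFunc();
        Console.WriteLine(func(0)); // True
        Console.WriteLine(func(1)); // False
    }
}
```

This sketch makes no performance claim of its own. Timing it against the variants above, on the runtime you actually target, is the only way to know whether it helps.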
Regarding "c# - Performance of Func<T> and inheritance", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15669358/