
c++ - What's the best way to demonstrate the effect of affinity setting?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:15:40

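Once I noticed that Windows doesn't keep a computation-intensive thread on a specific core; it keeps switching cores instead. So I speculated that the job would finish faster if the thread kept access to the same data caches. And indeed, I was able to observe a stable ~1% speed improvement after setting the thread's affinity mask to a single core (in a ppmd (de)compression thread). But then I tried to build a simple demo of this effect and more or less failed. That is, it works as expected on my system (Q9450):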

buflog=21 bufsize=2097152
(cache flush) first run = 6.938s
time with default affinity = 6.782s
time with first core only = 6.578s
speed gain is 3.01%

But the people I asked weren't exactly able to reproduce the effect. Any suggestions?

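#include <stdio.h>
#include <stdlib.h>   // atoi
#include <windows.h>

int buflog = 21, bufsize, bufmask;
char* a;              // buffer walked by the measured (main) thread
char* b;              // buffer walked by the background load thread
volatile int r = 0;   // sink, so the compiler cannot discard the loop

// Performs 10^9 rounds of pseudo-random reads and returns elapsed milliseconds.
__declspec(noinline)
int benchmark( char* a ) {
    DWORD t0 = GetTickCount();
    // unsigned on purpose: the generator and the checksum are meant to wrap around
    unsigned h = 1, s = 0;
    for( int i = 0; i < 1000000000; i++ ) {
        h = h*200002979u + 1;
        // three reads: anywhere in the buffer, in its first 1/4, in its first 1/16
        s += ((int&)a[h&bufmask]) + ((int&)a[h&(bufmask>>2)]) + ((int&)a[h&(bufmask>>4)]);
    }
    r = s;
    return int( GetTickCount() - t0 );
}

// Background load: keeps the second core busy with the same access pattern.
DWORD WINAPI loadcore( LPVOID ) {
    SetThreadAffinityMask( GetCurrentThread(), 2 );   // mask 2 = second logical core
    while(1) benchmark(b);
}

int main( int argc, char** argv ) {
    // optional argument: log2 of the buffer size, accepted only if above 16
    if( (argc>1) && (atoi(argv[1])>16) ) buflog = atoi(argv[1]);
    bufsize = 1<<buflog; bufmask = bufsize-1;
    // +4 bytes of slack because a 4-byte read may start at the last byte; () zero-fills
    a = new char[bufsize+4]();
    b = new char[bufsize+4]();
    printf( "buflog=%i bufsize=%i\n", buflog, bufsize );
    CreateThread( 0, 0, &loadcore, 0, 0, 0 );
    printf( "(cache flush) first run = %.3fs\n", float(benchmark(a))/1000 );
    float t1 = benchmark(a); t1/=1000;
    printf( "time with default affinity = %.3fs\n", t1 );
    SetThreadAffinityMask( GetCurrentThread(), 1 );   // mask 1 = first logical core
    float t2 = benchmark(a); t2/=1000;
    printf( "time with first core only = %.3fs\n", t2 );
    printf( "speed gain is %4.2f%%\n", (t1-t2)*100/t1 );
    return 0;
}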

P.S. I can post a link to a compiled version if anyone needs it.
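Since the demo takes a single measurement per configuration, an effect of 1 to 3% is hard to tell apart from run-to-run noise, which may be part of why others could not reproduce it. Below is a minimal sketch of a repeated-measurement harness, assuming a C++11 compiler; `time_runs`, `summarize`, `speed_gain_percent` and `TimingSummary` are hypothetical names that do not appear in the question, and `std::chrono::steady_clock` stands in for `GetTickCount`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper type: summary of repeated timings, in seconds.
struct TimingSummary {
    double min_s;     // fastest run
    double median_s;  // middle run (upper median when the count is even)
};

// Runs `work` `runs` times and times each run with a monotonic clock.
template <class Work>
std::vector<double> time_runs(Work work, int runs) {
    std::vector<double> samples;
    samples.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        work();
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    return samples;
}

// Sorts a copy of the samples and picks the minimum and the median.
TimingSummary summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    return { samples.front(), samples[samples.size() / 2] };
}

// Same formula as the last printf in the question: (t1 - t2) * 100 / t1.
double speed_gain_percent(double before_s, double after_s) {
    return (before_s - after_s) * 100.0 / before_s;
}
```

One way to use it in the demo would be to call `time_runs([&]{ benchmark(a); }, 5)` once before and once after `SetThreadAffinityMask`, then pass the two minimums (or medians) to `speed_gain_percent`.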

Best answer

Default affinity: (image: default affinity)
(Source: dreamhosters.com)

Affinity set to core #4: (image: affinity set to core #4)
(Source: dreamhosters.com)

Now, that is an archiver. Do you really think the worker threads would be wandering all over the CPU?
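The two screenshots did not survive in this copy. A rough numeric way to make the same kind of comparison is to sample which core the thread is running on between slices of work and count how often that changes. This is only a sketch under assumptions: `current_core`, `sample_cores` and `count_migrations` are hypothetical names, the Windows branch relies on `GetCurrentProcessorNumber` (Vista or later), and the other branch assumes Linux with glibc's `sched_getcpu`.

```cpp
#include <cstddef>
#include <vector>

#ifdef _WIN32
#include <windows.h>
// Core the calling thread is on right now (relative to its processor group).
static int current_core() { return static_cast<int>(GetCurrentProcessorNumber()); }
#else
#include <sched.h>
// Assumption: Linux with glibc; sched_getcpu returns -1 on failure.
static int current_core() { return sched_getcpu(); }
#endif

// Counts how many times two consecutive samples report a different core.
std::size_t count_migrations(const std::vector<int>& cores) {
    std::size_t n = 0;
    for (std::size_t i = 1; i < cores.size(); ++i)
        if (cores[i] != cores[i - 1]) ++n;
    return n;
}

// Runs `work_slice` `samples` times and records the core after each slice.
template <class Work>
std::vector<int> sample_cores(Work work_slice, int samples) {
    std::vector<int> cores;
    cores.reserve(samples);
    for (int i = 0; i < samples; ++i) {
        work_slice();
        cores.push_back(current_core());
    }
    return cores;
}
```

Running `count_migrations(sample_cores(slice, n))` once with the default affinity and once after pinning the thread would give two numbers to compare. With the thread pinned to a single core, the count should stay at zero.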

Regarding "c++ - What's the best way to demonstrate the effect of affinity setting?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3280197/
