
ios - Apple Metal matrix multiplication benchmark gives inconsistent results

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 08:32:51

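I am trying out Apple's Metal matrix multiplication example here: https://developer.apple.com/library/ios/samplecode/MetalPartialSumsCompute/Introduction/Intro.html

I get strange results: for tests [1]-[7], Metal runs at about 0.05 GFlops. From test [8] through [20], Metal suddenly becomes very fast, at roughly 500 GFlops. I have attached the log below. I looked at the code and nothing differs between the tests; they are all random matrices of similar size. It looks as if Metal starts running fast at some point for no reason. Any idea what is going on?

Log: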

2016-06-30 16:13:29.609 MetalMatrixMultiplication-iOS[3459:742844] >> [1] Matrix Dimensions: A = [841 x 2012], B = [2012 x 554], C = [841 x 554], lda = 848, ldb = 560, ldc = 560
>> [1] Accelerate 6.934929 gflops/sec, Metal 0.044756 gflops/sec, Accelerate 27.034708 millisecs, Metal 4189.027417 millisecs, Diff 1.369554e-01

2016-06-30 16:13:31.747 MetalMatrixMultiplication-iOS[3459:742844] >> [2] Matrix Dimensions: A = [721 x 432], B = [432 x 1436], C = [721 x 1436], lda = 728, ldb = 1440, ldc = 1440
>> [2] Accelerate 1.405928 gflops/sec, Metal 0.045415 gflops/sec, Accelerate 63.626833 millisecs, Metal 1969.722500 millisecs, Diff 4.248900e-02

2016-06-30 16:13:34.820 MetalMatrixMultiplication-iOS[3459:742844] >> [3] Matrix Dimensions: A = [1362 x 457], B = [457 x 1078], C = [1362 x 1078], lda = 1368, ldb = 1080, ldc = 1080
>> [3] Accelerate 1.754547 gflops/sec, Metal 0.046793 gflops/sec, Accelerate 76.485125 millisecs, Metal 2867.863083 millisecs, Diff 3.673622e-02

2016-06-30 16:13:45.549 MetalMatrixMultiplication-iOS[3459:742844] >> [4] Matrix Dimensions: A = [1783 x 1901], B = [1901 x 1347], C = [1783 x 1347], lda = 1784, ldb = 1352, ldc = 1352
>> [4] Accelerate 6.528442 gflops/sec, Metal 0.091166 gflops/sec, Accelerate 139.869000 millisecs, Metal 10016.091333 millisecs, Diff 5.854867e-02

2016-06-30 16:13:48.912 MetalMatrixMultiplication-iOS[3459:742844] >> [5] Matrix Dimensions: A = [709 x 600], B = [600 x 1683], C = [709 x 1683], lda = 712, ldb = 1688, ldc = 1688
>> [5] Accelerate 2.629253 gflops/sec, Metal 0.045250 gflops/sec, Accelerate 54.460208 millisecs, Metal 3164.426333 millisecs, Diff 4.654048e-02

2016-06-30 16:13:57.534 MetalMatrixMultiplication-iOS[3459:742844] >> [6] Matrix Dimensions: A = [636 x 1573], B = [1573 x 1942], C = [636 x 1942], lda = 640, ldb = 1944, ldc = 1944
>> [6] Accelerate 7.106906 gflops/sec, Metal 0.047387 gflops/sec, Accelerate 54.674458 millisecs, Metal 8199.887292 millisecs, Diff 7.446345e-02

2016-06-30 16:14:10.669 MetalMatrixMultiplication-iOS[3459:742844] >> [7] Matrix Dimensions: A = [1803 x 1689], B = [1689 x 1950], C = [1803 x 1950], lda = 1808, ldb = 1952, ldc = 1952
>> [7] Accelerate 6.759199 gflops/sec, Metal 0.096267 gflops/sec, Accelerate 175.709292 millisecs, Metal 12337.145375 millisecs, Diff 4.568898e-02

2016-06-30 16:14:10.878 MetalMatrixMultiplication-iOS[3459:742844] >> [8] Matrix Dimensions: A = [416 x 749], B = [749 x 2034], C = [416 x 2034], lda = 416, ldb = 2040, ldc = 2040
>> [8] Accelerate 3.589321 gflops/sec, Metal 220.343105 gflops/sec, Accelerate 35.313750 millisecs, Metal 0.575250 millisecs, Diff 0.000000e+00

2016-06-30 16:14:11.003 MetalMatrixMultiplication-iOS[3459:742844] >> [9] Matrix Dimensions: A = [657 x 716], B = [716 x 734], C = [657 x 734], lda = 664, ldb = 736, ldc = 736
>> [9] Accelerate 2.946337 gflops/sec, Metal 102.394388 gflops/sec, Accelerate 23.438083 millisecs, Metal 0.674417 millisecs, Diff 0.000000e+00

2016-06-30 16:14:11.124 MetalMatrixMultiplication-iOS[3459:742844] >> [10] Matrix Dimensions: A = [446 x 945], B = [945 x 707], C = [446 x 707], lda = 448, ldb = 712, ldc = 712
>> [10] Accelerate 3.426099 gflops/sec, Metal 94.259957 gflops/sec, Accelerate 17.394667 millisecs, Metal 0.632250 millisecs, Diff 0.000000e+00

2016-06-30 16:14:11.533 MetalMatrixMultiplication-iOS[3459:742844] >> [11] Matrix Dimensions: A = [935 x 1286], B = [1286 x 1899], C = [935 x 1899], lda = 936, ldb = 1904, ldc = 1904
>> [11] Accelerate 6.185983 gflops/sec, Metal 441.997324 gflops/sec, Accelerate 73.824208 millisecs, Metal 1.033208 millisecs, Diff 0.000000e+00

2016-06-30 16:14:11.685 MetalMatrixMultiplication-iOS[3459:742844] >> [12] Matrix Dimensions: A = [541 x 956], B = [956 x 960], C = [541 x 960], lda = 544, ldb = 960, ldc = 960
>> [12] Accelerate 3.805037 gflops/sec, Metal 153.253113 gflops/sec, Accelerate 26.097417 millisecs, Metal 0.647958 millisecs, Diff 0.000000e+00

2016-06-30 16:14:12.007 MetalMatrixMultiplication-iOS[3459:742844] >> [13] Matrix Dimensions: A = [1278 x 1809], B = [1809 x 500], C = [1278 x 500], lda = 1280, ldb = 504, ldc = 504
>> [13] Accelerate 7.661287 gflops/sec, Metal 343.033372 gflops/sec, Accelerate 30.176417 millisecs, Metal 0.673958 millisecs, Diff 0.000000e+00

2016-06-30 16:14:12.456 MetalMatrixMultiplication-iOS[3459:742844] >> [14] Matrix Dimensions: A = [1933 x 1534], B = [1534 x 805], C = [1933 x 805], lda = 1936, ldb = 808, ldc = 808
>> [14] Accelerate 7.221810 gflops/sec, Metal 696.681127 gflops/sec, Accelerate 66.105417 millisecs, Metal 0.685250 millisecs, Diff 0.000000e+00

2016-06-30 16:14:12.552 MetalMatrixMultiplication-iOS[3459:742844] >> [15] Matrix Dimensions: A = [291 x 645], B = [645 x 1034], C = [291 x 1034], lda = 296, ldb = 1040, ldc = 1040
>> [15] Accelerate 2.155479 gflops/sec, Metal 62.162540 gflops/sec, Accelerate 18.007750 millisecs, Metal 0.624417 millisecs, Diff 0.000000e+00

2016-06-30 16:14:12.940 MetalMatrixMultiplication-iOS[3459:742844] >> [16] Matrix Dimensions: A = [1656 x 1547], B = [1547 x 781], C = [1656 x 781], lda = 1656, ldb = 784, ldc = 784
>> [16] Accelerate 7.341706 gflops/sec, Metal 424.495925 gflops/sec, Accelerate 54.504792 millisecs, Metal 0.942667 millisecs, Diff 0.000000e+00

2016-06-30 16:14:13.425 MetalMatrixMultiplication-iOS[3459:742844] >> [17] Matrix Dimensions: A = [1651 x 1320], B = [1320 x 1429], C = [1651 x 1429], lda = 1656, ldb = 1432, ldc = 1432
>> [17] Accelerate 6.615108 gflops/sec, Metal 1001.902932 gflops/sec, Accelerate 94.155625 millisecs, Metal 0.621667 millisecs, Diff 0.000000e+00

2016-06-30 16:14:13.757 MetalMatrixMultiplication-iOS[3459:742844] >> [18] Matrix Dimensions: A = [2037 x 384], B = [384 x 1615], C = [2037 x 1615], lda = 2040, ldb = 1616, ldc = 1616
>> [18] Accelerate 1.737157 gflops/sec, Metal 331.366545 gflops/sec, Accelerate 145.440583 millisecs, Metal 0.762458 millisecs, Diff 0.000000e+00

2016-06-30 16:14:13.923 MetalMatrixMultiplication-iOS[3459:742844] >> [19] Matrix Dimensions: A = [795 x 677], B = [677 x 1145], C = [795 x 1145], lda = 800, ldb = 1152, ldc = 1152
>> [19] Accelerate 3.405232 gflops/sec, Metal 192.017503 gflops/sec, Accelerate 36.194667 millisecs, Metal 0.641875 millisecs, Diff 0.000000e+00

2016-06-30 16:14:14.033 MetalMatrixMultiplication-iOS[3459:742844] >> [20] Matrix Dimensions: A = [1062 x 438], B = [438 x 678], C = [1062 x 678], lda = 1064, ldb = 680, ldc = 680
>> [20] Accelerate 2.090133 gflops/sec, Metal 98.388385 gflops/sec, Accelerate 30.177583 millisecs, Metal 0.641083 millisecs, Diff 0.000000e+00

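Best answer

What is happening is that the operation fails, but the demo code does not check the command buffer's status, so the failed runs merely look faster.

If you add this block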

if (m_CmdBuffer.status == MTLCommandBufferStatusError) {
    // The command buffer failed; log the NSError that Metal attached to it.
    NSLog(@"Error occurred when executing command buffer");
    NSLog(@"Error code: %@", m_CmdBuffer.error);
}

to the end of the finalize method of MetalMatrixMult (line 513 of MetalMatrixMult.mm), you will see when the errors occur.

It first fails with:

Error code: Error Domain=MTLCommandBufferErrorDomain Code=2 "Caused GPU Timeout Error (IOAF code 2)" UserInfo={NSLocalizedDescription=Caused GPU Timeout Error (IOAF code 2)}

Then, after it has reported a few of those:

Error code: Error Domain=MTLCommandBufferErrorDomain Code=4 "Ignored (for causing prior/excessive GPU errors) (IOAF code 4)" UserInfo={NSLocalizedDescription=Ignored (for causing prior/excessive GPU errors) (IOAF code 4)}
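
The "Ignored" error also fits what the log shows for the fast runs: the Metal time stays at about 0.6 to 1.0 ms however large the matrices are. A minimal sketch of that comparison follows, in plain C++ so that it does not need a Metal device. The names `Run`, `flops`, `workRatio`, `timeRatio`, `kRun15`, `kRun17` and `printComparison` are illustrative and not part of Apple's sample, and counting the work as 2*m*k*n operations is an assumption about how to measure it, not necessarily the formula the sample uses.

```cpp
#include <cassert>
#include <cstdio>

// One row copied from the log above: C = A (m x k) * B (k x n).
struct Run {
    int id;          // test number in square brackets
    double m, k, n;  // matrix dimensions
    double metalMs;  // the "Metal ... millisecs" value of that row
};

// Arithmetic in a dense matrix product, counted as 2*m*k*n operations
// (an assumption; the sample may count work differently).
double flops(const Run& r) {
    return 2.0 * r.m * r.k * r.n;
}

// How many times more arithmetic `other` needs than `base`.
double workRatio(const Run& base, const Run& other) {
    assert(flops(base) > 0.0);
    return flops(other) / flops(base);
}

// How many times longer `other` took than `base`, according to the log.
double timeRatio(const Run& base, const Run& other) {
    assert(base.metalMs > 0.0);
    return other.metalMs / base.metalMs;
}

// Rows [15] and [17] from the log above.
const Run kRun15 = {15, 291.0, 645.0, 1034.0, 0.624417};
const Run kRun17 = {17, 1651.0, 1320.0, 1429.0, 0.621667};

// Prints both ratios so they can be compared side by side.
void printComparison(const Run& base, const Run& other) {
    std::printf("[%d] vs [%d]: work x%.2f, time x%.3f\n",
                other.id, base.id,
                workRatio(base, other), timeRatio(base, other));
}
```

Calling `printComparison(kRun15, kRun17)` shows that test [17] needs roughly 16 times the arithmetic of test [15] yet takes essentially the same time (a ratio of about 1.0). That is what one would expect if the command buffer is being rejected rather than executed, though only the status check above can confirm it.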

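Another thing I have noticed with Metal on iOS 9 is that there appears to be a memory management bug when GPU Frame Capture and Metal API Validation are turned on (Edit Scheme -> Options tab). It is as if Metal buffers are not released when running in this mode.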

Regarding "ios - Apple Metal matrix multiplication benchmark gives inconsistent results", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38131606/
