
macos - Poor performance of CGEventPost under GPU load

Reposted. Author: 行者123. Updated: 2023-12-03 06:00:35

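We've stumbled upon a performance problem with Quartz Events, more specifically with CGEventPost: under heavy GPU load, CGEventPost can block. We created a small benchmark application to demonstrate the issue. The application is just a loop that creates, posts, and releases events.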

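Below you can see the results of running the application. The first run is on an idle system. The second run is with FurMark (a GPU stress test) running with its settings turned up as far as they go.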

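  • Inner is the time spent in the inner loop, which basically just creates, posts, and releases an event using Quartz Events.
  • Outer is the time our program spends waiting to be woken up (a sleep). It should be close to the time we sleep for, but it may be delayed if the system is under pressure.
  • Post (shown as CGEventPost in the logs) is the time the call that posts the event takes.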
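The benchmark source is not reproduced here, so the following is only a minimal sketch of how one iteration of such a loop could take the three measurements. The names `now_ms`, `sample`, `measure_once`, and `print_sample`, the choice of a mouse-moved event, and the coordinates are all assumptions made for illustration. Outside macOS the Quartz calls are compiled out, so only the timing skeleton remains.

```c
#ifndef __APPLE__
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime and nanosleep */
#endif
#include <assert.h>
#include <stdio.h>
#include <time.h>
#ifdef __APPLE__
#include <ApplicationServices/ApplicationServices.h>
#endif

/* Monotonic clock reading in milliseconds. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

typedef struct {
    double inner_ms;  /* create + post + release */
    double outer_ms;  /* time spent sleeping */
    double post_ms;   /* the CGEventPost call alone */
} sample;

/* One iteration: sleep, then create, post, and release a single event. */
static sample measure_once(long sleep_ms) {
    assert(sleep_ms >= 0);
    struct timespec req = { sleep_ms / 1000, (sleep_ms % 1000) * 1000000L };
    sample s;

    double t0 = now_ms();
    nanosleep(&req, NULL);
    double t1 = now_ms();

    double t2 = t1, t3 = t1;
#ifdef __APPLE__
    /* Assumption: a mouse-moved event; the original benchmark may post
     * a different event type. Note that this really moves the cursor. */
    CGEventRef ev = CGEventCreateMouseEvent(NULL, kCGEventMouseMoved,
                                            CGPointMake(100, 100),
                                            kCGMouseButtonLeft);
    if (ev != NULL) {
        t2 = now_ms();
        CGEventPost(kCGHIDEventTap, ev);
        t3 = now_ms();
        CFRelease(ev);
    }
#endif
    double t4 = now_ms();

    s.outer_ms = t1 - t0;
    s.inner_ms = t4 - t1;
    s.post_ms = t3 - t2;
    return s;
}

/* Prints one sample in the same shape as the log lines in this post. */
static void print_sample(sample s) {
    printf("inner (ms): %.2f, outer (ms): %.2f, CGEventPost (ms): %.2f\n",
           s.inner_ms, s.outer_ms, s.post_ms);
}
```

Calling `print_sample(measure_once(10))` in a loop would produce lines comparable to the ones below. On macOS the file needs to be linked against the ApplicationServices framework.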

 

18:58:01.683 EventPerformance[4946:707] Measurements: (outer should be close to 10)
18:58:01.684 EventPerformance[4946:707] inner (ms): 0.04, outer (ms): 11.02, CGEventPost (ms): 0.03
18:58:01.684 EventPerformance[4946:707] inner (ms): 0.04, outer (ms): 11.02, CGEventPost (ms): 0.03
18:58:01.685 EventPerformance[4946:707] inner (ms): 0.07, outer (ms): 10.26, CGEventPost (ms): 0.03
18:58:01.685 EventPerformance[4946:707] inner (ms): 0.06, outer (ms): 10.85, CGEventPost (ms): 0.05
18:58:01.686 EventPerformance[4946:707] inner (ms): 0.07, outer (ms): 10.41, CGEventPost (ms): 0.04
18:58:01.686 EventPerformance[4946:707] inner (ms): 0.04, outer (ms): 10.39, CGEventPost (ms): 0.03
18:58:01.686 EventPerformance[4946:707] inner (ms): 0.05, outer (ms): 11.02, CGEventPost (ms): 0.03
18:58:01.687 EventPerformance[4946:707] inner (ms): 0.03, outer (ms): 10.67, CGEventPost (ms): 0.03
18:58:01.687 EventPerformance[4946:707] inner (ms): 0.08, outer (ms): 10.09, CGEventPost (ms): 0.05
18:58:01.688 EventPerformance[4946:707] Averages: (outer should be close to 10)
18:58:01.688 EventPerformance[4946:707] avg inner (ms): 0.05, avg outer (ms): 10.64, avg post (ms): 0.03

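Here we can see that posting an event takes about 0.03 ms on average. The thread also seems to be woken up slightly late, by roughly 0.6 ms on average (avg outer is 10.64 ms against a 10 ms sleep). There are no spikes in CGEventPost.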

19:02:02.150 EventPerformance[5241:707] Measurements: (outer should be close to 10)
19:02:02.151 EventPerformance[5241:707] inner (ms): 0.03, outer (ms): 10.23, CGEventPost (ms): 0.02
19:02:02.151 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 10.54, CGEventPost (ms): 0.02
19:02:02.151 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 11.01, CGEventPost (ms): 0.01
19:02:02.152 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 10.74, CGEventPost (ms): 0.01
19:02:02.152 EventPerformance[5241:707] inner (ms): 0.02, outer (ms): 10.20, CGEventPost (ms): 0.01
19:02:02.152 EventPerformance[5241:707] inner (ms): 10.35, outer (ms): 11.01, CGEventPost (ms): 10.35
19:02:02.152 EventPerformance[5241:707] inner (ms): 0.03, outer (ms): 10.02, CGEventPost (ms): 0.02
19:02:02.153 EventPerformance[5241:707] inner (ms): 58.90, outer (ms): 10.11, CGEventPost (ms): 58.90
19:02:02.153 EventPerformance[5241:707] inner (ms): 0.03, outer (ms): 10.12, CGEventPost (ms): 0.02
19:02:02.153 EventPerformance[5241:707] Averages: (outer should be close to 10)
19:02:02.371 EventPerformance[5241:707] avg inner (ms): 7.71, avg outer (ms): 10.44, avg post (ms): 7.71

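When the system is under heavy GPU load, posting an event can take milliseconds at its peaks instead of microseconds: in the run above, two posts took 10.35 ms and 58.90 ms, which pulls the average up to 7.71 ms. Under extreme GPU stress (< 1 FPS), a single post can take seconds. CGEventPost sometimes seems to wait for the GPU to finish some work before returning. Our thread is still scheduled normally, with no noticeable delays or spikes (outer).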

Any ideas are appreciated.

Best answer

I guess you are filling up the queue (the underlying mach port)...

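You can confirm this with the "Dispatch" or "System Calls" instrument in Instruments. (Create a new blank document, add the instrument, then under File > Record Options... make sure "Deferred Mode" is checked.) This will show all thread events in your application: when a thread blocks, when it sleeps, when it becomes active, and why.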

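I would first try raising the priority of the thread that calls CGEventPost (see man 3 PTHREAD_SCHEDPARAM). If your thread is blocked on a lower-priority thread, the kernel should temporarily boost the priority of the thread it is waiting on, to avoid priority inversion and help your task complete sooner.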
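As a rough sketch of that first suggestion, the function below raises a thread to the highest priority its current scheduling policy allows, using only the POSIX calls documented in that man page. The name `raise_to_max_priority` is made up for this example, and whether a higher priority actually shortens the stalls is something to measure rather than assume.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Raise `thread` to the highest priority allowed for its current
 * scheduling policy. Returns 0 on success or an errno-style code. */
static int raise_to_max_priority(pthread_t thread) {
    int policy;
    struct sched_param param;

    /* Read the current policy so that only the priority is changed. */
    int rc = pthread_getschedparam(thread, &policy, &param);
    if (rc != 0)
        return rc;

    int max = sched_get_priority_max(policy);
    if (max == -1)
        return errno;

    param.sched_priority = max;
    return pthread_setschedparam(thread, policy, &param);
}
```

The thread that posts events would call `raise_to_max_priority(pthread_self())` once, before entering its loop.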

Overall, I think you will have to implement a two-thread solution, along these lines:

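Create a queue for the events you want to post. From the main thread (or whichever thread produces the events), add events to this queue, then signal a second thread (an event consumer thread that you create) to walk the queue and post any outstanding events with CGEventPost.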

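When CGEventPost blocks, the event consumer thread blocks with it, but no other thread is held up. When CGEventPost finally unblocks, the consumer thread resumes and posts whatever events have accumulated in the queue in the meantime.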
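A minimal sketch of that two-thread arrangement, using a fixed-size ring buffer guarded by a pthread mutex and condition variable. Everything here is illustrative: `event_queue`, `queue_start`, `queue_push`, `queue_stop`, and the capacity are invented names and values, and `stub_post` is only a stand-in for a callback that would call CGEventPost and then CFRelease on macOS. The producer never waits on a post; when the buffer is full, `queue_push` fails and the caller decides what to drop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 256               /* arbitrary capacity for the sketch */

typedef void (*post_fn)(void *event);

typedef struct {
    void *items[QUEUE_CAP];
    size_t head, count;
    int closed;
    post_fn post;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    pthread_t consumer;
} event_queue;

/* Stand-in for the real poster; it only counts what it receives. */
static int stub_posted = 0;
static void stub_post(void *event) { (void)event; stub_posted++; }

/* Consumer thread: pops events and posts them without holding the lock. */
static void *consumer_main(void *arg) {
    event_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count == 0 && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->count == 0)
            break;                          /* closed and fully drained */
        void *ev = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        q->post(ev);                        /* may block; producers are unaffected */
        pthread_mutex_lock(&q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Initializes the queue and starts the consumer. Returns 0 on success. */
static int queue_start(event_queue *q, post_fn post) {
    assert(post != NULL);
    q->head = q->count = 0;
    q->closed = 0;
    q->post = post;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    return pthread_create(&q->consumer, NULL, consumer_main, q);
}

/* Returns 0 if the event was queued, -1 if the queue is full or closed. */
static int queue_push(event_queue *q, void *event) {
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (!q->closed && q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = event;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Lets the consumer drain the remaining events, then joins it. */
static void queue_stop(event_queue *q) {
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->consumer, NULL);
    pthread_mutex_destroy(&q->lock);
    pthread_cond_destroy(&q->nonempty);
}
```

The lock is released around the call to `post`, so a stall inside it only delays the consumer; `queue_push` on the producing thread still returns immediately.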

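Another possibility: can you coalesce events? Certain kinds of events (mouse moves?) could be merged into fewer events. You will probably still hit the queue limit of CGEventPost from time to time, so I think the two-thread approach is probably your best bet.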
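To illustrate the coalescing idea, here is a small sketch that collapses each run of consecutive mouse moves in a pending list into the most recent one, leaving other events and their order untouched. The `pending_event` record and its `ev_type` values are hypothetical stand-ins for whatever the application queues; they are not Quartz types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application-side event record (not a Quartz type). */
typedef enum { EV_MOUSE_MOVE, EV_KEY_DOWN, EV_KEY_UP } ev_type;

typedef struct {
    ev_type type;
    int x, y;      /* cursor position, used by EV_MOUSE_MOVE */
    int keycode;   /* used by the key events */
} pending_event;

/* Collapses each run of consecutive mouse moves into its last element,
 * in place, and returns the new number of events. */
static size_t coalesce_moves(pending_event *ev, size_t n) {
    assert(ev != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 && ev[i].type == EV_MOUSE_MOVE &&
            ev[out - 1].type == EV_MOUSE_MOVE) {
            ev[out - 1] = ev[i];   /* newer position replaces the older one */
        } else {
            ev[out++] = ev[i];
        }
    }
    return out;
}
```

Only adjacent moves are merged, so a key press between two moves still sees the cursor position that preceded it.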

Regarding "macos - Poor performance of CGEventPost under GPU load", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14342444/
