c++ - What is the difference between a busy loop with Sleep(0) and one with the pause instruction?

Reposted by: 行者123. Last updated: 2023-12-02 00:04:52

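I want to wait for an event in my application that is expected to happen almost immediately, so I don't want to put my thread to sleep and wake it up later. I'm wondering what the difference is between using Sleep(0) and the hardware pause instruction.

I cannot see any difference in CPU utilization for the following program. My question is not about power-saving considerations.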

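#include <windows.h>   // Sleep

// Never set in this test, so the loop spins forever.
// volatile forces a fresh read of the flag on every iteration.
volatile bool t = false;

int main() {
    while (t == false)
    {
        __asm { pause }   // MSVC inline assembly; 32-bit x86 builds only
        //Sleep(0);
    }
}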

Best answer

Windows Sleep(0) vs. the PAUSE instruction

Let me quote from the Intel 64 and IA-32 Architectures Optimization Reference Manual.

In multi-threading implementation, a popular construct in thread synchronization and for yielding scheduling quanta to another thread waiting to carry out its task is to sit in a loop and issuing SLEEP(0).

These are typically called “sleep loops” (see example #1). It should be noted that a SwitchToThread call can also be used. The “sleep loop” is common in locking algorithms and thread pools as the threads are waiting on work.

This construct of sitting in a tight loop and calling Sleep() service with a parameter of 0 is actually a polling loop with side effects:

Each call to Sleep() experiences the expensive cost of a context switch, which can be 10000+ cycles.

It also suffers the cost of ring 3 to ring 0 transitions, which can be 1000+ cycles.

When there is no other thread waiting to take possession of control, this sleep loop behaves to the OS as a highly active task demanding CPU resource, preventing the OS to put the CPU into a low-power state.

Example #1. Unoptimized Sleep Loop

while(!acquire_lock())
{ Sleep( 0 ); }
do_work();
release_lock();

Example #2. Power Consumption Friendly Sleep Loop Using PAUSE

ATTEMPT_AGAIN:
if (!acquire_lock())
{ /* Spin on pause max_spin_count times before backing off to sleep */
    for(int j = 0; j < max_spin_count; ++j)
    { /* intrinsic for PAUSE instruction*/
        _mm_pause();
        if (read_volatile_lock())
        {
            if (acquire_lock()) goto PROTECTED_CODE;
        }
    }
    /* Pause loop didn't work, sleep now */
    Sleep(0);
    goto ATTEMPT_AGAIN;
}
PROTECTED_CODE:
do_work();
release_lock();

Example #2 shows the technique of using PAUSE instruction to make the sleep loop power friendly.

By slowing down the “spin-wait” with the PAUSE instruction, the multi-threading software gains:

Performance by facilitating the waiting tasks to acquire resources more easily from a busy wait.

Power-savings by both using fewer parts of the pipeline while spinning.

Elimination of great majority of unnecessarily executed instructions caused by the overhead of a Sleep(0) call.

In one case study, this technique achieved 4.3x of performance gain, which translated to 21% power savings at the processor and 13% power savings at platform level.
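As a rough illustration of the spin-then-back-off pattern in Example #2, here is a minimal, self-contained C++ sketch. It is not code from the Intel manual: the SpinThenYieldLock class, the cpu_relax() helper, the default max_spin_count of 1000, and run_counter_demo() are all hypothetical names and values chosen for this example, and std::this_thread::yield() stands in for Sleep(0) so the sketch also builds outside Windows.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
  #include <immintrin.h>
  inline void cpu_relax() { _mm_pause(); }   // intrinsic for the PAUSE instruction
#else
  inline void cpu_relax() {}                 // assumption: no PAUSE equivalent used on other targets
#endif

// Hypothetical lock type for illustration only.
class SpinThenYieldLock {
public:
    explicit SpinThenYieldLock(int max_spin_count = 1000) : max_spin_(max_spin_count) {}

    void lock() {
        for (;;) {
            if (try_acquire()) return;
            // Spin on PAUSE a bounded number of times before backing off.
            for (int j = 0; j < max_spin_; ++j) {
                cpu_relax();
                // Cheap read first; try the atomic exchange only when the lock looks free.
                if (!locked_.load(std::memory_order_relaxed) && try_acquire()) return;
            }
            // The pause loop did not work: give up the time slice (stand-in for Sleep(0)).
            std::this_thread::yield();
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    bool try_acquire() { return !locked_.exchange(true, std::memory_order_acquire); }

    std::atomic<bool> locked_{false};
    int max_spin_;
};

// Demo: several threads increment a plain counter under the lock and the total is returned.
long run_counter_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinThenYieldLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;          // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling run_counter_demo(4, 10000) should return 40000 if the lock provides mutual exclusion. A sensible spin count depends on how long the lock is usually held, so it is best treated as a tuning parameter to measure rather than a constant.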

Pause Latency in Skylake Microarchitecture

The PAUSE instruction is typically used with software threads executing on two logical processors located in the same processor core, waiting for a lock to be released. Such short wait loops tend to last between tens and a few hundreds of cycles, so performance-wise it is more beneficial to wait while occupying the CPU than yielding to the OS. When the wait loop is expected to last for thousands of cycles or more, it is preferable to yield to the operating system by calling one of the OS synchronization API functions, such as WaitForSingleObject on Windows OS.

The PAUSE instruction is intended to:

Temporarily provide the sibling logical processor (ready to make forward progress exiting the spin loop) with competitively shared hardware resources. The competitively-shared microarchitectural resources that the sibling logical processor can utilize in the Skylake microarchitecture are: (1) More front end slots in the Decode ICache, LSD and IDQ; (2) More execution slots in the RS.

Save power consumed by the processor core compared to executing equivalent spin loop instruction sequence in the following configurations: (1) One logical processor is inactive (e.g. entering a C-state); (2) Both logical processors in the same core execute the PAUSE instruction; (3) HT is disabled (e.g. using BIOS options).

The latency of PAUSE instruction in prior generation microarchitecture is about 10 cycles, whereas on Skylake microarchitecture it has been extended to as many as 140 cycles.

The increased latency (allowing more effective utilization of competitively-shared microarchitectural resources to the logical processor ready to make forward progress) has a small positive performance impact of 1-2% on highly threaded applications. It is expected to have negligible impact on less threaded applications if forward progress is not blocked on executing a fixed number of looped PAUSE instructions.

There's also a small power benefit in 2-core and 4-core systems. As the PAUSE latency has been increased significantly, workloads that are sensitive to PAUSE latency will suffer some performance loss.
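Because the latency of a single PAUSE differs so much between microarchitectures (about 10 cycles versus as many as 140), a loop that spins for a fixed number of PAUSE instructions waits for very different amounts of time on different CPUs. One possible way to reduce that sensitivity, sketched below, is to bound the spin by elapsed time instead of by iteration count. The function spin_until_set(), the cpu_relax() helper, and the inner batch of 16 pauses are assumptions made for this example, not something prescribed by the manual.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
  #include <immintrin.h>
  inline void cpu_relax() { _mm_pause(); }   // intrinsic for the PAUSE instruction
#else
  inline void cpu_relax() {}                 // assumption: no PAUSE equivalent used on other targets
#endif

// Spins until `flag` becomes true or the time budget runs out.
// Returns true if the flag was observed set, false on timeout.
bool spin_until_set(const std::atomic<bool>& flag, std::chrono::microseconds budget) {
    const auto deadline = std::chrono::steady_clock::now() + budget;
    for (;;) {
        if (flag.load(std::memory_order_acquire)) return true;
        // Issue a small batch of pauses between clock reads so the clock
        // query does not dominate the loop (16 is an arbitrary choice).
        for (int i = 0; i < 16; ++i) cpu_relax();
        if (std::chrono::steady_clock::now() >= deadline) {
            // One last check so a flag set right at the deadline is not missed.
            return flag.load(std::memory_order_acquire);
        }
    }
}
```

When spin_until_set() returns false, the caller would typically fall back to a blocking wait such as WaitForSingleObject, in line with the manual's advice for waits of thousands of cycles or more.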

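You can find more information on this topic, along with code samples, in the Intel 64 and IA-32 Architectures Optimization Reference Manual and the Intel 64 and IA-32 Architectures Software Developer's Manual.

My opinion

It is better to structure the program logic so that neither Sleep(0) nor the PAUSE instruction is ever needed. In other words, avoid "spin-wait" loops altogether. Instead, use high-level synchronization functions such as WaitForMultipleObjects(), SetEvent(), and so on. If you compare the tools at your disposal in terms of performance, efficiency, and power saving, the higher-level functions are the best choice. They also pay for expensive context switches and ring 3 to ring 0 transitions, but these costs are infrequent and more than reasonable compared with the total you would spend on all the "spin-wait" PAUSE cycles combined, or on the cycles with Sleep(0).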
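To make the alternative concrete, here is a minimal sketch of waiting on an event instead of spinning. It uses std::mutex and std::condition_variable as a portable stand-in for the Windows calls named above, so it is an analogy to SetEvent() and WaitForSingleObject() rather than a use of them. The Event class and run_event_demo() are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical manual-reset event, loosely modeled on SetEvent / WaitForSingleObject.
class Event {
public:
    // Marks the event as signaled and wakes every waiting thread.
    void set() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Blocks the calling thread until set() has been called.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate guards against spurious wakeups and against
        // set() having run before wait() was entered.
        cv_.wait(lock, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Demo: the main thread blocks until a producer thread publishes a value.
int run_event_demo() {
    Event ready;
    int payload = 0;
    std::thread producer([&] {
        payload = 42;   // written before set(); the mutex inside set()/wait() orders this write
        ready.set();
    });
    ready.wait();       // the thread is blocked by the OS here instead of spinning
    producer.join();
    assert(payload == 42);
    return payload;
}
```

The waiting thread pays for one block and one wake-up, but it executes no instructions in between, which is the trade-off described above.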

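On processors that support hyper-threading, "spin-wait" loops can consume a significant portion of the processor's execution bandwidth. One logical processor executing a spin-wait loop can severely degrade the performance of the other logical processor. That is why disabling hyper-threading sometimes improves performance, as some people have pointed out.

Continuously polling for device, file, or state changes in the program's workflow can make the computer consume more power, put pressure on memory and the bus, and cause unnecessary page faults. (Use Task Manager in Windows to see which applications produce the most page faults while idle, waiting in the background for user input. These are the least efficient applications, because they use the polling described above.) Minimize polling, including spin loops, whenever possible, and use an event-driven approach and/or framework if one is available. This is the best practice that I strongly recommend. Your application should literally sleep all the time, waiting on multiple events set up in advance.

A good example of an event-driven application is Nginx, originally written for UNIX-like operating systems. Operating systems provide various functions and methods for notifying your application, so use those notifications instead of polling for device state changes. Just let your program sleep indefinitely until a notification or user input arrives. This technique removes the overhead of code polling the status of a data source, because the code is notified asynchronously when the status changes.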

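Regarding "c++ - What is the difference between a busy loop with Sleep(0) and one with the pause instruction?", the original question can be found on Stack Overflow: https://stackoverflow.com/questions/7488196/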
