c++ - How do the "acquire" and "consume" memory orders differ, and when is "consume" preferable?

Reposted. Author: IT老高. Updated: 2023-10-28 13:22:25

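The C++11 standard defines a memory model (1.7, 1.10) that includes memory orderings, which are, roughly, "sequentially consistent", "acquire", "consume", "release", and "relaxed". Equally roughly, a program is correct only if it is race-free, which is the case if all actions can be put in some order in which one action happens-before another. An action X happens-before an action Y if either X is sequenced before Y (within one thread), or X inter-thread-happens-before Y. The latter condition holds, among other cases, when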

  • X synchronizes with Y, or
  • X is dependency-ordered before Y.

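Synchronizes-with holds when X is an atomic store with "release" ordering on some atomic variable and Y is an atomic load with "acquire" ordering on the same variable. Dependency-ordered-before holds in the analogous situation where Y is a load with "consume" ordering (and a suitable memory access). The notion of synchronizes-with extends the happens-before relationship transitively across actions that are sequenced-before one another within a thread, whereas dependency-ordered-before is extended transitively only through a strict subset of sequenced-before called carries-dependency, which follows a largish set of rules and, notably, can be interrupted with std::kill_dependency.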
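To make the two relations concrete, here is a minimal sketch. It assumes a hypothetical publication slot `g_ptr`, a `Payload` type, and an unrelated global `g_side`; none of these names come from the question. With an acquire load, everything the producer wrote before its release store is visible. With a consume load, only accesses that carry a dependency from the loaded pointer are ordered. Note that mainstream compilers generally implement `memory_order_consume` by strengthening it to `memory_order_acquire`, so the difference is one of formal guarantees rather than of generated code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Payload { int value; };           // hypothetical payload type

std::atomic<Payload*> g_ptr{nullptr};    // hypothetical publication slot
int g_side = 0;                          // plain data NOT reachable through g_ptr

void producer() {
    static Payload p;
    p.value = 42;                        // sequenced before the release store
    g_side = 7;                          // also sequenced before the release store
    g_ptr.store(&p, std::memory_order_release);
}

int reader_acquire() {
    Payload* p;
    while (!(p = g_ptr.load(std::memory_order_acquire))) std::this_thread::yield();
    // The release store synchronizes with this acquire load, so both
    // p->value and the unrelated g_side are safe to read.
    return p->value + g_side;
}

int reader_consume() {
    Payload* p;
    while (!(p = g_ptr.load(std::memory_order_consume))) std::this_thread::yield();
    // Only accesses that carry a dependency from p are ordered: p->value is
    // fine, but reading g_side here would formally be a data race.
    int v = p->value;
    // kill_dependency ends the chain explicitly: later uses of the result
    // are no longer treated as dependent on the consume load.
    return std::kill_dependency(v);
}

struct DemoResult { int via_acquire; int via_consume; };

// Runs one producer and both readers once (not designed to be called twice).
DemoResult run_demo() {
    DemoResult r{};
    std::thread writer(producer);
    std::thread a([&] { r.via_acquire = reader_acquire(); });
    std::thread c([&] { r.via_consume = reader_consume(); });
    writer.join(); a.join(); c.join();
    assert(r.via_acquire == 49 && r.via_consume == 42);
    return r;
}
```

The consume reader touches only data reached through the loaded pointer, which is exactly the restriction that carries-dependency imposes.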

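So what is the purpose of the notion of "dependency ordering"? What advantage does it offer over the simpler sequenced-before / synchronizes-with ordering? Since its rules are stricter, I assume it can be implemented more efficiently.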

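Can you give an example of a program where switching from release/acquire to release/consume is both correct and provides a non-trivial advantage? And when would std::kill_dependency provide an improvement? High-level arguments would be nice, but bonus points for hardware-specific differences.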

Best answer

Data dependency ordering was introduced by N2492 with the following rationale:

There are two significant use cases where the current working draft (N2461) does not support scalability near that possible on some existing hardware.

  • read access to rarely written concurrent data structures

Rarely written concurrent data structures are quite common, both in operating-system kernels and in server-style applications. Examples include data structures representing outside state (such as routing tables), software configuration (modules currently loaded), hardware configuration (storage device currently in use), and security policies (access control permissions, firewall rules). Read-to-write ratios well in excess of a billion to one are quite common.

  • publish-subscribe semantics for pointer-mediated publication

Much communication between threads is pointer-mediated, in which the producer publishes a pointer through which the consumer can access information. Access to that data is possible without full acquire semantics.

In such cases, use of inter-thread data-dependency ordering has resulted in order-of-magnitude speedups and similar improvements in scalability on machines that support inter-thread data-dependency ordering. Such speedups are possible because such machines can avoid the expensive lock acquisitions, atomic instructions, or memory fences that are otherwise required.

(Emphasis mine.)

The motivating use case presented there is rcu_dereference() from the Linux kernel.
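As a rough C++ analogue of that pattern, the sketch below shows a rarely written table that readers access through a consume load, which plays approximately the role of rcu_dereference(). The names `RouteTable`, `g_routes`, `publish_routes`, `current_version`, and `current_gateway` are hypothetical, and safe reclamation of old tables (the grace-period half of RCU) is deliberately left out, so old versions are simply leaked.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical read-mostly record, e.g. a routing table.
struct RouteTable {
    int version;
    std::string default_gateway;
};

std::atomic<const RouteTable*> g_routes{nullptr};

// Writer (rare): build the new table completely, then publish it with a
// release store so its contents are visible through the pointer.
void publish_routes(int version, std::string gateway) {
    assert(version >= 0);  // -1 is reserved below for "no table published"
    const RouteTable* fresh = new RouteTable{version, std::move(gateway)};
    // Assumption: the previous table is leaked. Real RCU would free it only
    // after all pre-existing readers have finished.
    g_routes.store(fresh, std::memory_order_release);
}

// Reader (frequent): the consume load orders only the accesses made through
// the loaded pointer, which is all this reader needs.
int current_version() {
    const RouteTable* t = g_routes.load(std::memory_order_consume);
    return t ? t->version : -1;
}

std::string current_gateway() {
    const RouteTable* t = g_routes.load(std::memory_order_consume);
    return t ? t->default_gateway : std::string{};
}
```

On hardware that preserves data-dependency ordering, a reader like this needs no fence in principle, which is where the speedups described in the quoted rationale would come from. Whether a given compiler actually emits a cheaper load is a separate question.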

Source: the Stack Overflow question "How do the 'acquire' and 'consume' memory orders differ, and when is 'consume' preferable?": https://stackoverflow.com/questions/19609964/
