I have a question about the GCC-Wiki article. Under the heading "Overall Summary", the following code example is given:
Thread 1:
y.store (20);
x.store (10);
Thread 2:
if (x.load() == 10) {
  assert (y.load() == 20);
  y.store (10);
}
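It is said that if all stores are release and all loads are acquire, the assert in thread 2 cannot fail. That much is clear to me (because the store to x in thread 1 synchronizes with the load from x in thread 2).
But now comes the part I don't understand. It is also said that if all stores are release and all loads are consume, the result is the same. Couldn't the load from y be hoisted before the load from x (since there is no dependency between these variables)? That would mean the assert in thread 2 can actually fail.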
Best answer
The C11 standard's ruling is as follows.
5.1.2.4 Multi-threaded executions and data races
An evaluation A is dependency-ordered before 16) an evaluation B if:
— A performs a release operation on an atomic object M, and, in another thread, B performs a consume operation on M and reads a value written by any side effect in the release sequence headed by A, or
— for some evaluation X, A is dependency-ordered before X and X carries a dependency to B.
An evaluation A inter-thread happens before an evaluation B if A synchronizes with B, A is dependency-ordered before B, or, for some evaluation X:
— A synchronizes with X and X is sequenced before B,
— A is sequenced before X and X inter-thread happens before B, or
— A inter-thread happens before X and X inter-thread happens before B.
NOTE 7 The ‘‘inter-thread happens before’’ relation describes arbitrary concatenations of ‘‘sequenced before’’, ‘‘synchronizes with’’, and ‘‘dependency-ordered before’’ relationships, with two exceptions. The first exception is that a concatenation is not permitted to end with ‘‘dependency-ordered before’’ followed by ‘‘sequenced before’’. The reason for this limitation is that a consume operation participating in a ‘‘dependency-ordered before’’ relationship provides ordering only with respect to operations to which this consume operation actually carries a dependency. The reason that this limitation applies only to the end of such a concatenation is that any subsequent release operation will provide the required ordering for a prior consume operation. The second exception is that a concatenation is not permitted to consist entirely of ‘‘sequenced before’’. The reasons for this limitation are (1) to permit ‘‘inter-thread happens before’’ to be transitively closed and (2) the ‘‘happens before’’ relation, defined below, provides for relationships consisting entirely of ‘‘sequenced before’’.
An evaluation A happens before an evaluation B if A is sequenced before B or A inter-thread happens before B.
A visible side effect A on an object M with respect to a value computation B of M satisfies the conditions:
— A happens before B, and
— there is no other side effect X to M such that A happens before X and X happens before B.
The value of a non-atomic scalar object M, as determined by evaluation B, shall be the value stored by the visible side effect A.
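(Emphasis added.)
In the commentary below, I'll abbreviate as follows:
- Dependency-ordered before: DOB
- Inter-thread happens before: ITHB
- Happens before: HB
- Sequenced before: SeqB
Let us review how this applies. We have 4 relevant memory operations, which we will name Evaluations A, B, C and D: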
Thread 1:
y.store (20); // Release; Evaluation A
x.store (10); // Release; Evaluation B
Thread 2:
if (x.load() == 10) { // Consume; Evaluation C
  assert (y.load() == 20); // Consume; Evaluation D
  y.store (10);
}
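To prove that the assert never trips, we are in effect trying to prove that A is always a visible side effect at D. By 5.1.2.4 (15), we have:
A SeqB B DOB C SeqB D
This is a concatenation that ends in DOB followed by SeqB. NOTE 7 in (17) explicitly rules that this is not an ITHB concatenation, despite what the definition of ITHB in (16) says.
We know that A and D are not in the same thread of execution, so A is not SeqB D. Hence neither of the two conditions for HB in (18) is satisfied, and A does not HB D.
It follows that A is not a visible side effect with respect to D, since one of the conditions of (19) is not met. The assert may fail.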
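The chain argument above can be checked mechanically. Below is a minimal C++ sketch, not taken from the original answer, that encodes only the three ITHB bullet rules and the HB definition quoted from 5.1.2.4 and applies them to a linear chain of relations. The names `Rel`, `all_seqb`, `ithb` and `happens_before` are hypothetical, and the model assumes that no dependency is carried past a DOB edge (as in the question, where the two loads are unrelated).

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: the relation linking each evaluation in a chain to the next.
enum class Rel { SeqB, SW, DOB };  // sequenced before, synchronizes with, dependency-ordered before

// True if every edge in e[i..j] (inclusive) is SeqB.
static bool all_seqb(const std::vector<Rel>& e, int i, int j) {
    for (int k = i; k <= j; ++k)
        if (e[k] != Rel::SeqB) return false;
    return true;
}

// True if the chain e[i..j] establishes "inter-thread happens before",
// using only the rules quoted from 5.1.2.4.
// Assumption: a DOB edge never carries a dependency to later evaluations.
static bool ithb(const std::vector<Rel>& e, int i, int j) {
    // Base cases: A synchronizes with B, or A is dependency-ordered before B.
    if (i == j) return e[i] == Rel::SW || e[i] == Rel::DOB;
    // A synchronizes with X and X is sequenced before B.
    if (e[i] == Rel::SW && all_seqb(e, i + 1, j)) return true;
    // A is sequenced before X and X inter-thread happens before B.
    if (e[i] == Rel::SeqB && ithb(e, i + 1, j)) return true;
    // A inter-thread happens before X and X inter-thread happens before B.
    for (int k = i; k < j; ++k)
        if (ithb(e, i, k) && ithb(e, k + 1, j)) return true;
    return false;
}

// A happens before B if A is sequenced before B or A inter-thread happens before B.
static bool happens_before(const std::vector<Rel>& e) {
    if (e.empty()) return false;
    const int last = static_cast<int>(e.size()) - 1;
    return all_seqb(e, 0, last) || ithb(e, 0, last);
}
```

Under these assumptions, `happens_before({Rel::SeqB, Rel::DOB, Rel::SeqB})` (the consume chain A, B, C, D) evaluates to false, while replacing `Rel::DOB` with `Rel::SW` (the acquire case) evaluates to true. The recursion is exponential in the worst case, which is acceptable for chains of three or four edges.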
How this can play out is described here, in the C++ standard's memory model discussion, and here, Section 4.2 Control Dependencies:
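- (Some time beforehand) Thread 2's branch predictor guesses that the if will be taken.
- Thread 2 approaches the predicted-taken branch and begins speculative fetching.
- Thread 2, out of order and speculatively, loads 0xGUNK from y (Evaluation D). (Maybe it was not yet evicted from cache?)
- Thread 1 stores 20 into y (Evaluation A).
- Thread 1 stores 10 into x (Evaluation B).
- Thread 2 loads 10 from x (Evaluation C).
- Thread 2 confirms that the if is taken.
- Thread 2's speculative load of y == 0xGUNK is committed.
- Thread 2 fails the assert.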
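The reason Evaluation D is allowed to be reordered before C is that a consume does not forbid it. This is unlike an acquire load, which prevents any load or store that follows it in program order from being reordered before it. Again, NOTE 7 of 5.1.2.4 states that a consume operation participating in a "dependency-ordered before" relationship provides ordering only with respect to operations to which this consume operation actually carries a dependency, and there is definitely no dependency between the two loads.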
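For contrast, here is a minimal C++ sketch, not part of the original answer, of a case where the consume load does carry a dependency to the later read: the second access goes through the pointer that the consume load returned. The names `Payload`, `g_payload`, `g_ptr`, `producer`, `consumer` and `run_dependency_demo` are hypothetical. Note also that, to my knowledge, mainstream compilers currently implement `memory_order_consume` by strengthening it to acquire, so this program shows the pattern the standard guarantees rather than any observable difference on real hardware.

```cpp
#include <atomic>
#include <thread>

// Hypothetical payload published through an atomic pointer.
struct Payload { int value; };

static Payload g_payload{0};
static std::atomic<Payload*> g_ptr{nullptr};

// Thread 1: write the payload, then publish its address with a release store.
static void producer() {
    g_payload.value = 20;                                // plain (non-atomic) write
    g_ptr.store(&g_payload, std::memory_order_release);  // release operation on g_ptr
}

// Thread 2: wait for the pointer with consume loads, then read through it.
static int consumer() {
    Payload* p = nullptr;
    while ((p = g_ptr.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }
    // The address used here is computed from p, so the consume load carries
    // a dependency to this read and the write of 20 happens before it.
    return p->value;
}

// Runs both threads once and returns the value the consumer observed.
int run_dependency_demo() {
    g_payload.value = 0;
    g_ptr.store(nullptr);
    int seen = -1;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

In the question's code there is no such link: `y.load()` does not use the value returned by `x.load()`, so the consume on x orders nothing with respect to the load from y.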
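Verification with CppMem
CppMem is a tool that helps explore shared data access scenarios under the C11 and C++11 memory models.
For the following code, which approximates the scenario in the question: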
int main() {
atomic_int x, y;
y.store(30, mo_seq_cst);
{{{ { y.store(20, mo_release);
x.store(10, mo_release); }
||| { r3 = x.load(mo_consume).readsvalue(10);
r4 = y.load(mo_consume); }
}}};
return 0; }
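The tool reports two consistent, race-free scenarios, namely one in which y=20 is successfully read, and one in which the "stale" initialization value y=30 is read. The freehand circle is mine.
By contrast, when mo_acquire is used for the loads, CppMem reports only one consistent, race-free scenario, namely the correct one, in which y=20 is read.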
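The acquire variant of the CppMem program can be approximated in standard C++ roughly as follows. This is a sketch under assumptions, not the original author's code: the function name `run_acquire_demo` and the variable `r4` are hypothetical, and the spin loop on x stands in for CppMem's `.readsvalue(10)` constraint.

```cpp
#include <atomic>
#include <thread>

static std::atomic<int> x{0};
static std::atomic<int> y{0};

// Returns the value thread 2 reads from y after it has observed x == 10.
int run_acquire_demo() {
    x.store(0);
    y.store(30, std::memory_order_seq_cst);  // "stale" initial value, as in the CppMem program
    int r4 = -1;

    std::thread t1([] {
        y.store(20, std::memory_order_release);
        x.store(10, std::memory_order_release);
    });
    std::thread t2([&r4] {
        // Wait until the store of 10 is visible (stand-in for readsvalue(10)).
        while (x.load(std::memory_order_acquire) != 10) {
            std::this_thread::yield();
        }
        // The acquire load that read 10 synchronizes with the release store
        // to x, so the store of 20 to y happens before this load.
        r4 = y.load(std::memory_order_acquire);
    });

    t1.join();
    t2.join();
    return r4;
}
```

With acquire loads the returned value is 20 on every run. Changing both loads to `std::memory_order_consume` removes that guarantee in the abstract machine, as shown by the second CppMem scenario, although a test run is unlikely to show 30 on compilers that treat consume as acquire.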
Regarding c++ - the difference between memory_order_consume and memory_order_acquire, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31993500/