c - Race condition in glibc/NPTL/Linux robust mutexes?

In a comment on the question Automatically release mutex on crashes in Unix, back in 2010, jilles claimed:

glibc's robust mutexes are so fast because glibc takes dangerous shortcuts. There is no guarantee that the mutex still exists when the kernel marks it as "will cause EOWNERDEAD". If the mutex was destroyed and the memory replaced by a memory mapped file that happens to contain the last owning thread's ID at the right place and the last owning thread terminates just after writing the lock word (but before fully removing the mutex from its list of owned mutexes), the file is corrupted. Solaris and will-be-FreeBSD9 robust mutexes are slower because they do not want to take this risk.

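I can't make sense of this claim, since it is not legal to destroy a mutex unless it is unlocked (and therefore not in any thread's robust list). I also can't find any references when searching for such a bug or issue. Is the claim simply wrong?

The reason I ask, and the reason I'm interested, is that this bears on the correctness of my own implementation, which is built on the same Linux robust-mutex primitive.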

Best answer

I think I found the race, and it is indeed very ugly. It goes like this:

Thread A holds the robust mutex and unlocks it. The basic procedure is:

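1. Put it in the "pending" slot of the thread's robust list header.
2. Remove it from the linked list of robust mutexes held by the current thread.
3. Unlock the mutex.
4. Clear the "pending" slot of the thread's robust list header.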
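
The four steps above can be sketched with a small user-space model. This is only an illustration under stated assumptions: `model_mutex`, `model_robust_head`, `model_lock`, and `model_unlock` are hypothetical names, not glibc's, and the futex wake for waiters and all contention handling are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a robust mutex (not glibc's layout). */
struct model_mutex {
    _Atomic uint32_t lock_word;       /* 0 = unlocked, otherwise owner TID */
    struct model_mutex *next, *prev;  /* links in the owner's robust list */
};

/* Hypothetical model of the per-thread robust list header. */
struct model_robust_head {
    struct model_mutex *owned;        /* robust mutexes currently held */
    struct model_mutex *pending;      /* the "pending" slot */
};

/* Uncontended lock only: enough to set up state for the unlock sketch. */
static void model_lock(struct model_robust_head *h, struct model_mutex *m,
                       uint32_t tid)
{
    h->pending = m;
    atomic_store_explicit(&m->lock_word, tid, memory_order_acquire == 0
                          ? memory_order_relaxed : memory_order_seq_cst);
    m->prev = NULL;
    m->next = h->owned;
    if (h->owned)
        h->owned->prev = m;
    h->owned = m;
    h->pending = NULL;
}

static void model_unlock(struct model_robust_head *h, struct model_mutex *m)
{
    /* Step 1: record the mutex in the pending slot. */
    h->pending = m;

    /* Step 2: unlink it from the list of held robust mutexes. */
    if (m->prev)
        m->prev->next = m->next;
    else
        h->owned = m->next;
    if (m->next)
        m->next->prev = m->prev;
    m->next = m->prev = NULL;

    /* Step 3: release the lock word. From here on another thread may
     * lock, unlock, destroy, and unmap the mutex. */
    atomic_store_explicit(&m->lock_word, 0, memory_order_release);

    /* Race window: h->pending still points at memory that may be gone. */

    /* Step 4: clear the pending slot. */
    h->pending = NULL;
}
```

The point of the model is the comment between steps 3 and 4: the pending slot outlives the release of the lock word, and nothing in the sequence itself keeps the memory behind it alive.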

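The problem is that between steps 3 and 4, another thread in the same process can obtain the mutex, unlock it, and then, (rightly) believing itself to be the final user of the mutex, destroy it and free or unmap its memory. After that, suppose some thread in the process creates a shared mapping of a file, device, or shared memory, the mapping happens to be assigned the same address, and the value at that location happens to match the thread ID (the kernel TID) of the thread that is still between steps 3 and 4 of its unlock. If the process is then killed, the kernel will corrupt the mapped file by setting a high bit (the owner-died flag) in the 32-bit integer it takes to be the mutex owner ID.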
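
To make the corruption concrete, here is a rough model of what happens to the word behind the pending slot when the owning thread dies. It assumes the kernel compares the TID bits of the word with the dying thread's TID and, on a match, rewrites the word with the owner-died flag set. The constant values mirror those in `<linux/futex.h>`; the `MODEL_` names and `model_handle_death` are hypothetical, and the real kernel code does this with an atomic compare-and-exchange.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as FUTEX_WAITERS, FUTEX_OWNER_DIED, FUTEX_TID_MASK. */
#define MODEL_FUTEX_WAITERS    0x80000000u
#define MODEL_FUTEX_OWNER_DIED 0x40000000u
#define MODEL_FUTEX_TID_MASK   0x3fffffffu

/* Simplified model of exit-time handling for one robust list entry.
 * Returns 1 if the word was modified, 0 if it was left alone. */
static int model_handle_death(uint32_t *word, uint32_t dying_tid)
{
    if ((*word & MODEL_FUTEX_TID_MASK) != dying_tid)
        return 0;  /* not owned by the dying thread: untouched */

    /* Keep the waiters bit, drop the TID, mark the owner as dead. */
    *word = (*word & MODEL_FUTEX_WAITERS) | MODEL_FUTEX_OWNER_DIED;
    return 1;
}
```

In this model, if `word` is really four bytes of an unrelated shared file mapping that happen to equal the dying thread's TID, the call overwrites file data; any other value is left alone, which is why the race needs such an unlucky coincidence to do damage.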

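The solution is to hold a global lock on mmap/munmap between steps 2 and 4 above, exactly the same as the solution to the barrier issue described in my answer to this question: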

Can a correct fail-safe process-shared barrier be implemented on Linux?
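
One way such a global lock could look is sketched below, assuming a reader-writer lock: the robust unlock path holds it shared from before step 2 until after step 4, and every mapping change takes it exclusively. The names `vm_guard`, `vm_guard_enter`, `vm_guard_leave`, `guarded_mmap`, and `guarded_munmap` are hypothetical, and the sketch only protects mappings made through these wrappers. It is not necessarily how any particular libc implements the fix.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical process-wide guard over address-space changes. */
static pthread_rwlock_t vm_guard = PTHREAD_RWLOCK_INITIALIZER;

/* Called by the robust unlock path before step 2. Shared mode lets
 * several threads unlock concurrently. */
static void vm_guard_enter(void)
{
    int r = pthread_rwlock_rdlock(&vm_guard);
    assert(r == 0);
    (void)r;
}

/* Called by the robust unlock path after step 4. */
static void vm_guard_leave(void)
{
    int r = pthread_rwlock_unlock(&vm_guard);
    assert(r == 0);
    (void)r;
}

/* Mapping changes take the guard exclusively, so no address can be
 * reused while some thread still has a stale pending slot. */
static void *guarded_mmap(void *addr, size_t len, int prot, int flags,
                          int fd, off_t off)
{
    pthread_rwlock_wrlock(&vm_guard);
    void *p = mmap(addr, len, prot, flags, fd, off);
    pthread_rwlock_unlock(&vm_guard);
    return p;
}

static int guarded_munmap(void *addr, size_t len)
{
    pthread_rwlock_wrlock(&vm_guard);
    int r = munmap(addr, len);
    pthread_rwlock_unlock(&vm_guard);
    return r;
}
```

With this arrangement, an unlock would call `vm_guard_enter()` before unlinking the mutex and `vm_guard_leave()` after clearing the pending slot, so a `guarded_mmap` that could reuse the mutex's address has to wait until the slot is clear.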

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/11945429/
