On Thu, Nov 21, 2013 at 2:52 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> Actually, the weakest forms of locking only guarantee a consistent view
> of memory if you are actually holding the lock.  Not "a" lock, but "the"
> lock.

I don't think we necessarily support any architecture that does that,
though. And afaik, it's almost impossible to actually do sanely in
hardware with any sane cache coherency, so..

So realistically, I think we only really need to worry about memory
ordering that is tied to cache coherency protocols, where even locking
rules tend to be about memory ordering (although extended rules like
acquire/release rather than the broken pure barrier model).

Do you know any actual architecture where this isn't the case?

> So the three fixes I know of at the moment are:
>
> 1.      Upgrade smp_store_release()'s PPC implementation from lwsync
>         to sync.
>
>         What about ARM?  ARM platforms that have the load-acquire and
>         store-release instructions could use them, but other ARM
>         platforms have to use dmb.  ARM avoids PPC's lwsync issue
>         because it has no equivalent to lwsync.
>
> 2.      Place an explicit smp_mb() into the MCS-lock queued handoff
>         code.
>
> 3.      Remove the requirement that "unlock+lock" be a full memory
>         barrier.
>
> We have been leaning towards #1, but before making any hard decision
> on this we are looking more closely at what the situation is on other
> architectures.

So I might be inclined to lean towards #1 simply because of test coverage.

We have no sane test coverage of weakly ordered models. Sure, ARM may
be weakly ordered (with saner acquire/release in ARM64), but
realistically, no existing ARM platforms actually give us any
reasonable test *coverage* for things like this, despite having tons
of chips out there running Linux. Very few people debug problems in
that world.

The PPC people probably have much better testing and are more likely
to figure out the bugs, but don't have the pure number of machines.
So x86 tends to still remain the main platform where serious testing
gets done.

That said, I'd still be perfectly happy with #3, since - unlike, say,
the PCI ordering issues with drivers - at least people *can* try to
think about this somewhat analytically, even if it's ripe for
confusion and subtle mistakes.

And I still think you got the ordering wrong, and should be talking
about "lock+unlock" rather than "unlock+lock".

              Linus
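
For anyone trying to picture where fix #2 would sit, here is a rough
user-space sketch of an MCS-style queued handoff written with C11
atomics. It is not the kernel's MCS code: every name in it (mcs_node,
mcs_lock_acquire, MCS_HANDOFF_FULL_BARRIER, and so on) is made up for
illustration, the C11 release store only plays a role analogous to
smp_store_release(), and putting the full fence on the unlock side of
the handoff is an assumption, since the thread does not say which side
the smp_mb() would go on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; a sketch, not the kernel implementation. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool must_wait;      /* true while this waiter keeps spinning */
};

struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
    atomic_store_explicit(&node->next, (struct mcs_node *)NULL,
                          memory_order_relaxed);
    atomic_store_explicit(&node->must_wait, true, memory_order_relaxed);

    /* Swap ourselves in as the new tail of the queue. */
    struct mcs_node *prev =
        atomic_exchange_explicit(&lock->tail, node, memory_order_acq_rel);
    if (prev == NULL)
        return;                 /* uncontended: we own the lock */

    /* Link behind the previous waiter, then spin on our own node. */
    atomic_store_explicit(&prev->next, node, memory_order_release);
    while (atomic_load_explicit(&node->must_wait, memory_order_acquire))
        ;
}

static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *node)
{
    /* While the lock is held, the queue tail is never NULL. */
    assert(atomic_load_explicit(&lock->tail, memory_order_relaxed) != NULL);

    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_acquire);

    if (next == NULL) {
        /* No visible successor: try to mark the lock free. */
        struct mcs_node *expected = node;
        if (atomic_compare_exchange_strong_explicit(
                &lock->tail, &expected, (struct mcs_node *)NULL,
                memory_order_release, memory_order_relaxed))
            return;
        /* A successor swapped the tail; wait until it links itself. */
        while ((next = atomic_load_explicit(&node->next,
                                            memory_order_acquire)) == NULL)
            ;
    }

#ifdef MCS_HANDOFF_FULL_BARRIER
    /* Fix #2 in spirit: an explicit full fence in the handoff path.
     * Placement on the unlock side is an assumption of this sketch. */
    atomic_thread_fence(memory_order_seq_cst);
#endif

    /* Handoff: a release store, analogous to smp_store_release(). */
    atomic_store_explicit(&next->must_wait, false, memory_order_release);
}
```

Built without MCS_HANDOFF_FULL_BARRIER, the handoff is release/acquire
only, which is roughly the situation fix #3 would bless by dropping the
full-barrier requirement. Built with it, the handoff carries a full
fence at the cost of a heavier barrier on every contended unlock. The
sketch says nothing about fix #1, which changes what the release store
itself compiles to on PPC.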