On Fri, Jun 4, 2021 at 8:14 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> >
> > then I could in theory see the compiler doing that WRITE_ONCE() as
> > some kind of non-control dependency.
>
> This may be a minor point, but can that loophole be closed as follows?

Note that it's actually entirely sufficient to have the barrier just
on one side.

I brought it up mainly as an oddity, and because it can result in the
compiler generating different code for the two different directions.

The reason that it is sufficient is that with the barrier in place (on
either side), the compiler really can't do much. It can't join either
of the sides, because it has to do that barrier on one side before any
common code.

In fact, even if the compiler decides to first do a conditional call
just around the barrier, and then do any common code (and then do
_another_ conditional branch), it still did that conditional branch
first, and the problem is solved. The CPU doesn't care: it will have
to resolve the branch before any subsequent stores are finalized.

Of course, if the compiler creates a conditional call just around the
barrier, and the barrier is empty (like we do now), and the compiler
leaves no mark of it in the result (like it does seem to do for empty
asm statements), I could imagine some optimizing assembler (or linker)
screwing things up for us, and saying "a conditional branch to the
next instruction can just be removed".

At that point, we've lost again, and it's a toolchain issue. I don't
think that issue can currently happen, but it's an example of yet
another really subtle problem that *could* happen even if *we* do
everything right.

I also do not believe that any of our code that has this pattern would
have that situation where the compiler would generate a branch over
just the barrier. It's kind of similar to Paul's example in that sense.
When we use volatile_if(), the two sides are very, very different,
entirely regardless of the barrier, so in practice I think this is all
entirely moot.

               Linus
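
For reference, a minimal userspace sketch of the one-sided barrier
discussed above. It is an illustration under assumptions, not the
kernel's code: it assumes volatile_if() is built from a barrier_true()
helper that runs an empty asm with a "memory" clobber only on the taken
side, the READ_ONCE()/WRITE_ONCE() definitions are simplified stand-ins
for the kernel macros, and publish_if_set(), flag and data are
hypothetical names. It needs GCC or Clang (statement expressions and
__typeof__).

```c
/* Userspace sketch for GCC/Clang; NOT the kernel's definitions. */

/* Empty asm with a "memory" clobber: a compiler-only barrier. */
#define barrier()		__asm__ __volatile__("" ::: "memory")

/* Simplified stand-ins for the kernel macros of the same names. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

/* Assumed form: the barrier sits only on the "true" side. */
#define barrier_true()		({ barrier(); 1; })
#define volatile_if(x)		if ((x) && barrier_true())

/* Hypothetical shared variables. */
static int flag;
static int data;

/*
 * Hypothetical example: each store should stay conditional on the
 * load of flag. The two sides store different values, so they already
 * differ regardless of the barrier.
 */
static int publish_if_set(void)
{
	volatile_if (READ_ONCE(flag)) {
		WRITE_ONCE(data, 1);	/* taken side, after the barrier */
		return 1;
	}
	WRITE_ONCE(data, 2);		/* other side, no barrier here */
	return 0;
}
```

Because the barrier has to be executed on one side before any code the
two sides share, the compiler cannot merge the sides ahead of the
branch. The sketch only shows the shape of the construct; it does not
by itself demonstrate what any particular compiler will emit.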