On Fri, Sep 27, 2019 at 11:51:07AM +0200, Andrea Parri wrote:

> For the record, the LKMM doesn't currently model "order" derived from
> control dependencies to a _plain_ access (even if the plain access is
> a write): in particular, the following is racy (as far as the current
> LKMM is concerned):
>
> C rb
>
> { }
>
> P0(int *tail, int *data, int *head)
> {
> 	if (READ_ONCE(*tail)) {
> 		*data = 1;
> 		smp_wmb();
> 		WRITE_ONCE(*head, 1);
> 	}
> }
>
> P1(int *tail, int *data, int *head)
> {
> 	int r0;
> 	int r1;
>
> 	r0 = READ_ONCE(*head);
> 	smp_rmb();
> 	r1 = *data;
> 	smp_mb();
> 	WRITE_ONCE(*tail, 1);
> }
>
> Replacing the plain "*data = 1" with "WRITE_ONCE(*data, 1)" (or doing
> s/READ_ONCE(*tail)/smp_load_acquire(tail)) suffices to avoid the race.
> Maybe I'm short of imagination this morning... but I can't currently
> see how the compiler could "break" the above scenario.

The compiler, if sufficiently smart, is 'allowed' to change P0 into
something terrible like:

	*data = 1;
	if (*tail) {
		smp_wmb();
		*head = 1;
	} else
		*data = 0;

(assuming it knows *data was 0 from a prior store or something)

Using WRITE_ONCE() defeats this because volatile indicates external
visibility.

> I also didn't spend much time thinking about it. memory-barriers.txt
> has a section "CONTROL DEPENDENCIES" dedicated to "alerting developers
> using control dependencies for ordering". That's quite a long section
> (and probably still incomplete); the last paragraph summarizes: ;-)

Barring LTO the above works for perf because of inter-translation-unit
function calls, which imply a compiler barrier.

Now, when the compiler inlines, it loses that sync point (and thereby
subtly changes semantics from the non-inline variant). I suspect LTO
does the same and can cause subtle breakage through this transformation.

> (*) Compilers do not understand control dependencies. It is therefore
> your job to ensure that they do not break your code.

It is on the list of things I want to talk about when I finally get
relevant GCC and LLVM people in the same room ;-)

Ideally the compiler can be taught to recognise conditionals dependent
on 'volatile' loads and disallow problematic transformations around
them.

I've added Nick (clang) and Jose (GCC) on Cc, hopefully they can help
find the right people for us.
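
For illustration only, the fix mentioned above (marking the store to
*data with WRITE_ONCE()) can be sketched in userspace roughly as
follows. The macro definitions are simplified stand-ins assumed for
this sketch, not the kernel's actual definitions; smp_wmb() is
approximated with a GCC/Clang release fence, and p0_fixed is a
hypothetical name.

```c
/*
 * Userspace sketch of P0 with the plain store replaced by WRITE_ONCE().
 * Assumption: these macros only approximate the kernel's versions.
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Stand-in for the kernel's smp_wmb(); a release fence is assumed here. */
#define smp_wmb()        __atomic_thread_fence(__ATOMIC_RELEASE)

/* Hypothetical name: P0 from the litmus test, with a marked store. */
static void p0_fixed(int *tail, int *data, int *head)
{
	if (READ_ONCE(*tail)) {
		/*
		 * Volatile store: the compiler may not hoist it above the
		 * branch or invent an "undo" store on the else path.
		 */
		WRITE_ONCE(*data, 1);
		smp_wmb();
		WRITE_ONCE(*head, 1);
	}
}
```

Run single-threaded, this only shows the source-level shape of the
fix: with *tail == 0 nothing is written, and with *tail != 0 both
*data and *head end up as 1. It does not by itself demonstrate the
cross-CPU ordering, which is what the litmus test is for.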