On Mon, Jun 13, 2022 at 12:27:44PM +0000, Paul Heidekrüger wrote:
> As discussed, clarify LKMM not recognizing certain kinds of orderings.
> In particular, highlight the fact that LKMM might deliberately make
> weaker guarantees than compilers and architectures.
>
> Link: https://lore.kernel.org/all/YpoW1deb%2FQeeszO1@xxxxxxxxxxxxxxxxxxxxxxxx/T/#u
> Signed-off-by: Paul Heidekrüger <paul.heidekrueger@xxxxxxxxx>
> Cc: Marco Elver <elver@xxxxxxxxxx>
> Cc: Charalampos Mainas <charalampos.mainas@xxxxxxxxx>
> Cc: Pramod Bhatotia <pramod.bhatotia@xxxxxxxxx>
> Cc: Soham Chakraborty <s.s.chakraborty@xxxxxxxxxx>
> Cc: Martin Fink <martin.fink@xxxxxxxxx>
> ---
>  .../Documentation/litmus-tests.txt | 29 ++++++++++++-------
>  1 file changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/tools/memory-model/Documentation/litmus-tests.txt b/tools/memory-model/Documentation/litmus-tests.txt
> index 8a9d5d2787f9..623059eff84e 100644
> --- a/tools/memory-model/Documentation/litmus-tests.txt
> +++ b/tools/memory-model/Documentation/litmus-tests.txt
> @@ -946,22 +946,31 @@ Limitations of the Linux-kernel memory model (LKMM) include:
>  	carrying a dependency, then the compiler can break that dependency
>  	by substituting a constant of that value.
>
> -	Conversely, LKMM sometimes doesn't recognize that a particular
> -	optimization is not allowed, and as a result, thinks that a
> -	dependency is not present (because the optimization would break it).
> -	The memory model misses some pretty obvious control dependencies
> -	because of this limitation.  A simple example is:
> +	Conversely, LKMM will sometimes overstate the amount of reordering
> +	done by architectures and compilers, leading it to missing some
> +	pretty obvious orderings.  A simple example is:

I don't like the word "overstate" here.  How about instead:

	LKMM will sometimes overestimate the amount of reordering CPUs
	and compilers can carry out, leading it to miss some pretty
	obvious cases of ordering.

>
>  		r1 = READ_ONCE(x);
>  		if (r1 == 0)
>  			smp_mb();
>  		WRITE_ONCE(y, 1);
>
> -	There is a control dependency from the READ_ONCE to the WRITE_ONCE,
> -	even when r1 is nonzero, but LKMM doesn't realize this and thinks
> -	that the write may execute before the read if r1 != 0.  (Yes, that
> -	doesn't make sense if you think about it, but the memory model's
> -	intelligence is limited.)
> +	There is no dependency from the WRITE_ONCE() to the READ_ONCE(),

You mean "from the READ_ONCE() to the WRITE_ONCE()".

> +	and as a result, LKMM does not assume ordering.  However, the

... does not claim that the load is ordered before the store.

> +	smp_mb() in the if branch will prevent architectures from
> +	reordering the WRITE_ONCE() ahead of the READ_ONCE() but only if r1

Architectures don't do reordering; CPUs do.  In any case this sentence
is wrong; the presence of the "if" statement is what prevents the
reordering.  CPUs will never reorder a store before a conditional
branch, even if the store gets executed on both branches of the
conditional.  By contrast, the smp_mb() in one of the branches
prevents _compilers_ from moving the store before the conditional.

> +	is 0.  This, by definition, is not a control dependency, yet
> +	ordering is guaranteed in some cases, depending on the READ_ONCE(),
> +	which LKMM doesn't recognize.

Say instead:

	However, even though no dependency is present, the WRITE_ONCE()
	will not be executed before the READ_ONCE().  There are two
	reasons for this:

		The presence of the smp_mb() in one of the branches
		prevents the compiler from moving the WRITE_ONCE()
		up before the "if" statement, since the compiler has
		to assume that r1 will sometimes be 0 (but see the
		comment below);

		CPUs do not execute stores before po-earlier conditional
		branches, even in cases where the store occurs after the
		two arms of the branch have recombined.

> +
> +	It is clear that it is not dangerous in the slightest for LKMM to
> +	make weaker guarantees than architectures.  In fact, it is
> +	desirable, as it gives compilers room for making optimizations.
> +	For instance, because a value of 0 triggers undefined behavior

"because a value of 0 triggers undefined behavior" implies that
undefined behavior will always occur.  Instead say:

	For instance, suppose that a 0 value in r1 would trigger undefined
	behavior later on.  Then a clever compiler...

> +	elsewhere, a clever compiler might deduce that r1 can never be 0 in
> +	the if condition.  As a result, said clever compiler might deem it
> +	safe to optimize away the smp_mb(), eliminating the branch and
> +	any ordering an architecture would guarantee otherwise.

Alan

>
>  2.	Multiple access sizes for a single variable are not supported,
>  	and neither are misaligned or partially overlapping accesses.
> --
> 2.35.1
>
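
For anyone who wants to look at the two code shapes discussed above outside the kernel, here is a rough userspace sketch.  It is an analogy only and rests on assumptions that are not part of the patch: READ_ONCE() and WRITE_ONCE() are approximated by relaxed C11 atomic accesses, smp_mb() by a sequentially consistent fence, and the function names are made up for illustration.  The second function shows what might be left if a compiler had somehow proved that r1 is never 0.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared variables from the example; both start at 0. */
atomic_int x;
atomic_int y;

/*
 * Userspace stand-in for the quoted snippet.
 * ASSUMPTION: relaxed atomics approximate READ_ONCE()/WRITE_ONCE() and a
 * seq_cst fence approximates smp_mb().  The kernel primitives are not
 * defined in terms of C11, so this is not an exact translation.
 */
int read_then_write(void)
{
	int r1 = atomic_load_explicit(&x, memory_order_relaxed);

	if (r1 == 0)
		atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() */

	/* The store is executed on both paths of the "if". */
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	return r1;
}

/*
 * Hypothetical result of a compiler proving r1 != 0: the fence and the
 * branch are gone, leaving only the load and the store.
 */
int read_then_write_no_branch(void)
{
	int r1 = atomic_load_explicit(&x, memory_order_relaxed);

	atomic_store_explicit(&y, 1, memory_order_relaxed);
	return r1;
}
```

Calling either function from a single thread only confirms that the store to y happens and that the loaded value is returned; it cannot demonstrate ordering.  The sketch is mainly useful for comparing the compiler output of the two functions.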