On Wed, May 15, 2024 at 08:44:30AM +0200, Hernan Ponce de Leon wrote: > On 5/6/2024 8:00 PM, Paul E. McKenney wrote: > > On Mon, May 06, 2024 at 06:30:45PM +0200, Jonas Oberhauser wrote: > > > Am 5/6/2024 um 12:05 PM schrieb Jonas Oberhauser: > > > > Am 5/2/2024 um 1:21 AM schrieb Paul E. McKenney: > > > > > This commit adds four litmus tests showing that a failing cmpxchg() > > > > > operation is unordered unless followed by an smp_mb__after_atomic() > > > > > operation. > > > > > > > > So far, my understanding was that all RMW operations without suffix > > > > (xchg(), cmpxchg(), ...) will be interpreted as F[Mb];...;F[Mb]. > > > > > > > > I guess this shows again how important it is to model these full > > > > barriers explicitly inside the cat model, instead of relying on implicit > > > > conversions internal to herd. > > > > > > > > I'd like to propose a patch to this effect. > > > > > > > > What is the intended behavior of a failed cmpxchg()? Is it the same as a > > > > relaxed one? > > > > Yes, and unless I am too confused, LKMM currently does implement this. > > Please let me know if I am missing something. > > > > > > My suggestion would be in the direction of marking read and write events > > > > of these operations as Mb, and then defining > > > > > > > > (* full barrier events that appear in non-failing RMW *) > > > > let RMW_MB = Mb & (dom(rmw) | range(rmw)) > > > > > > > > > > > > let mb = > > > > [M] ; fencerel(Mb) ; [M] > > > > | [M] ; (po \ rmw) ; [RMW_MB] ; po^? ; [M] > > > > | [M] ; po^? ; [RMW_MB] ; (po \ rmw) ; [M] > > > > | ... > > > > > > > > The po \ rmw is because ordering is not provided internally of the rmw > > > > > > (removed the unnecessary si since LKMM is still non-mixed-accesses) > > > > Addition of mixed-access support would be quite welcome! > > > > > This could also be written with a single rule: > > > > > > | [M] ; (po \ rmw) & (po^?; [RMW_MB] ; po^?) ; [M] > > > > > > > I suspect that after we added [rmw] sequences it could perhaps be > > > > simplified [...] > > > > > > No, my suspicion is wrong - this would incorrectly let full-barrier RMWs > > > act like strong fences when they appear in an rmw sequence. > > > > > > if (z==1) || x = 2; || xchg(&y,2) || if (y==2) > > > x = 1; || y =_rel 1; || || z=1; > > > > > > > > > right now, we allow x=2 overwriting x=1 (in case the last thread does not > > > propagate x=2 along with z=1) because on power, the xchg might be > > > implemented with a sync that doesn't get executed until the very end > > > of the program run. > > > > > > > > > Instead of its negative form (everything other than inside the rmw), > > > it could also be rewritten positively. Here's a somewhat short form: > > > > > > let mb = > > > [M] ; fencerel(Mb) ; [M] > > > (* everything across a full barrier RMW is ordered. This includes up to > > > one event inside the RMW. *) > > > | [M] ; po ; [RMW_MB] ; po ; [M] > > > (* full barrier RMW writes are ordered with everything behind the RMW *) > > > | [W & RMW_MB] ; po ; [M] > > > (* full barrier RMW reads are ordered with everything before the RMW *) > > > | [M] ; po ; [R & RMW_MB] > > > | ... > > > > Does this produce the results expected by the litmus tests in the Linux > > kernel source tree and also those at https://github.com/paulmckrcu/litmus? > > > > Thanx, Paul > > I implemented in the dartagnan tool the changes proposed by Jonas (i.e. > changing the mb definition in the cat model and removing the fences that > were added programmatically). > > I run this using the ~5K litmus test I have (it should include everything > from the source tree + the non-LISA ones from your repo). I also checked > with the version of qspinlock discussed in [1]. > > I do get the expected results. Thank you very much, Hernan!!! And good show, Puranjay!!! Thanx, Paul > Hernan > > [1] https://lkml.org/lkml/2022/8/26/597 >