Re: LKMM: Making RMW barriers explicit

Jonas Oberhauser <jonas.oberhauser@xxxxxxxxxxxxxxx> · Fri, 24 May 2024 20:09:28 +0200

Am 5/24/2024 um 4:53 PM schrieb Alan Stern:
On Fri, May 24, 2024 at 07:34:06AM -0700, Boqun Feng wrote:
On Fri, May 24, 2024 at 10:14:25AM -0400, Alan Stern wrote:
[...]
Not really, RMW events contains all events generated from
read-modify-write primitives, if there is an R event without a rmw
relation (i.e there is no corresponding write event), it's a failed RWM
by definition: it cannot be anything else.

Not true.  An R event without an rmw relation could be a READ_ONCE().

No, the R event is already in the RWM events, it has come from a rwm
atomic.

Oh, right.  For some reason I was thinking that an instruction could
belong to only one set, R or RMW.  But that doesn't make sense.

I thought the same, so I perhaps contributed to that confusion.

Or a plain read.  The memory model uses the tag to distinguish these
cases.

that it would work is merely an artifact of herd7's internal actions.  I
think it would be much cleaner if herd7 represented these events in some
other way, particularly if we can tell it how.

After all, herd7 already does generate different sets of events for
successful (both R and W) and failed (only R) RMWs.  It's not such a big
stretch to make the R events it generates different in the two cases.

I thought we want to simplify the herd internal processing by avoid
adding mb events in cmpxchg() cases, in the same spirit, if syntactic
tagging is already good enough, why do we want to make herd complicate?

Herd7 already is complicated by the fact that it needs to handle
cmpxchg() instructions in two ways: success and failure.  This
complication is unavoidable.  Adding one extra layer (different tags for
the different ways) is an insignificant increase in the complication,
IMO. And it's a net reduction when you compare it to the amount of
complication currently in the herd7 code.

Also what about cmpxchg_acquire()?  If it fails, it will generate an R
event with an acquire tag not in the rmw relation.

I would like that, but that is not the current implementation, a failed 
atomic_compare_exchange always produces a R*[once]; this behavior is 
currently hardcoded in herd7.

 There is no way to
tell such events apart from a normal smp_load_acquire(), and so the .cat
file would have no way to know that the event should not have acquire
semantics.  I guess we would have to rename this tag, as well.

No,

	let read_of_rmw = (RMW & R)
	let fail_read_of_rmw = read_of_rmw / dom(rmw)
	let fail_cmpxchg_acquire = fail_read_of_rmw & [Acquire]

gives you all the failed cmpxchg_acquire() apart from a normal
smp_load_acquire().

Yes, I see.  Using this approach, no further changes to herd7 or the
def file would be needed.  We would just have to add 'mb to the
Accesses enumeration and to the list of tags allowed for an RMW
instruction.

Question: Since R and RMW have different lists of allowable tags, how
does herd7 decide which tags are allowed for an event that is in both
the R and RMW sets?

After looking over the patch draft for herd7 written by Hernan [1], my 
best guess is: it doen't. It seems that after herd7 detects you are 
using LKMM it simply drops all tags except 'acquire on read and 'release 
on store. Everything else becomes 'once (and 'mb adds some F[mb] sometimes).

That means that if one were to go through with the earlier suggestion to 
distinguish between the smp_mb() Mb and the cmpxchg() Mb, e.g. by 
calling the latter RmwMb, current herd7 would always erase the RmwMb tag 
because it is not called "acquire" or "release".
The same would happen of course if you introduced an RmwAcquire tag.

So there are several options:

- treat the tags as a syntactic thing which is always present, and
 specify their meaning purely in the cat file, analogous to what you
 have defined above. This is personally my favorite solution. To
 implement this in herd7 we would simply remove all the current special
 cases for the LKMM barriers, which I like.

 However then we need to actually define the behavior of herd if
 an inappropriate tag (like release on a load) is provided. Since the
 restriction is actually mostly enforced by the .def file - by simply
 not  providing a smp_store_acquire() etc. -, that only concerns RMWs,
 where xchg_acquire() would apply the Acquire tag to the write also.

 The easiest solution is to simply remove the syntax for specifying
 restrictions - it seems it is not doing much right now anyways - and
 do the filtering inside .bell, with things like

    (* only writes can have Release tags *)
    let Release = Release & W \ (RMW \ rng(rmw))

 One good thing about this way is that it would work even without
 modifying herd, since it is in a sense idempotent with the
 transformations done by herd.

 FWIW, in our internal weak memory model in Dresden we did exactly this,
 and use REL for the syntax and Rel for the semantic release ordering to
 make the distinction more clear, with things like:

    let Acq = (ACQ | SC) & (R | F)
    let Rel = (REL | SC) & (W | F)

 (SC is our equivalent to LKMM's Mb)

- treat the tags as semantic markers that are only present or not under
 some circumstances, and define the semantics fully based on the present
 tags. The circumstances are currently hardcoded in herd7, but we should
 move them out with some syntax like __cmpxchg{acquire}{once}.

 This requires touching the parser and of course still has the issue
 with the acquire tag appearing on the store as well.

- provide a full syntax for defining the event sequence that is
 expected. For example, for a cmpxchg we could do

    cmpxchg = { F[success-cmpxchg]; c = R&RMW[once]; if (c==v) { 
W&RMW[once]; } F[success-cmpxchg] }

 and then define in .bell that a success-cmpxchg is an mb if it is
 directly next to a cmpxchg that succeeds.

 The advantage is that you can customize the representation to whatever
 kernel devs thing is the most intuitive. The downside is that it
 requires major rewrites to herd7, someone to think about a reasonable
 language for specifying this etc.

Best wishes,
  jonas

[1] https://github.com/herd/herdtools7/pull/865