Re: Some observations (results) on BPF acquire and release

Andrea Parri <parri.andrea@xxxxxxxxx> · Fri, 25 Oct 2024 16:56:17 +0300

> But the subset of the LKMM which deals with "strong fences" and Acq &
> Rel (limited to so called marked accesses) seems relatively contained
> /simple:  its analysis could be useful, if not determining, in trying
> to resolve the above issues.

Elaborating on the previous suggestion/comparison with the LKMM, the
"subset" in question can take the following form (modulo my typos):

"LKMM with acquire/release, strong fences, and marked accesses only"

[...]

let acq-po = [Acq] ; po ; [M]
let po-rel = [M] ; po ; [Rel]
let strong-fence = [M] ; fencerel(Mb) ; [M]
let ppo = acq-po | po-rel | strong-fence

let A-cumul(r) = rfe? ; r
let cumul-fence = A-cumul(strong-fence | po-rel)
let overwrite = co | fr
let prop = (overwrite & ext)? ; cumul-fence* ; rfe?

let hb = ppo | rfe | ((prop \ id) & int)
acyclic hb as Hb

let pb = prop ; strong-fence ; hb*
acyclic pb as Pb

For BPF, we'd want to replace acq-po, po-rel and strong-fence with
load_acquire, store_release and po_amo_fetch respectively:  Unless
I'm missing something, this should restore the intended behaviors
for the R and Z6.3 tests discussed earlier.

A couple of other remarks:

- Notice how the above formalization is completely symmetrical wrt.
  co <-> fr, IOW, co links are considered "on par with" fr links.
  In particular, the following test is satisfiable in the above
  formalization, as is the corresponding C test in the LKMM:

BPF 2+2W+release+fence
{
 0:r2=x; 0:r4=y;
 1:r2=y; 1:r4=x; 1:r6=l;
}
 P0                                 | P1                                         ;
 r1 = 1                             | r1 = 1                                     ;
 *(u32 *)(r2 + 0) = r1              | *(u32 *)(r2 + 0) = r1                      ;
 r3 = 2                             | r5 = atomic_fetch_add((u32 *)(r6 + 0), r5) ;
 store_release((u32 *)(r4 + 0), r3) | r3 = 2                                     ;
                                    | store_release((u32 *)(r4 + 0), r3)         ;
exists ([x]=1 /\ [y]=1)

  (On an historical note, this wasn't always the case in the LKMM,
  cf. e.g. [1], but be alerted that the formalization in [1] is
  decisively more involved and less intuitive than today's / what
  the LKMM community has converged to.  ;-) )

- The above formalization merges the so called "Observation" axiom
  in the "Happens-before" axiom.  In the LKMM, this followed the
  removal of B-cumulativity for smp_wmb() and smp_store_release()
  and a consequent "great simplification" of the hb relation: link
  [2] can provide more details and some examples related to those
  changes.  For completeness, here is the BPF analogue of test
  "C-release-acquire-is-B-cumulative.litmus" from that article:

BPF ISA2+release+acquire+acquire
{
 0:r2=x; 0:r4=y;
 1:r2=y; 1:r4=z;
 2:r2=z; 2:r4=x;
}
 P0                                 | P1                                 | P2                                 ;
 r1 = 1                             | r1 = load_acquire((u32 *)(r2 + 0)) | r1 = load_acquire((u32 *)(r2 + 0)) ;
 *(u32 *)(r2 + 0) = r1              | r3 = 1                             | r3 = *(u32 *)(r4 + 0)              ;
 r3 = 1                             | *(u32 *)(r4 + 0) = r3              |                                    ;
 store_release((u32 *)(r4 + 0), r3) |                                    |                                    ;
exists (1:r1=1 /\ 2:r1=1 /\ 2:r3=0)

  The formalization sketched above allows this behavior.  Notice,
  however, that the behavior is forbidden after "completion" of
  the release/acquire chain, i.e. by making the store from P1 a
  store-release (a property also known as A-cumulativy of the
  release operation).

I guess the next question (once clarified the intentions for the R
and Z6.3 tests seen earlier) is "Does BPF really care about 2+2W
and B-cumulativity for store-release?"; I mentioned some tradeoff,
but in the end this is a call for the BPF community.

  Andrea

[1] https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/StrongModel.html
[2] https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/WeakModel.html#Cumulativity