Re: [RFC PATCH] LKMM: Add ctrl_dep() macro for control dependency

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Thu, 14 Oct 2021 09:23:11 -0700

On Thu, Oct 14, 2021 at 05:58:16PM +0200, Florian Weimer wrote:
> * Paul E. McKenney:
> 
> > On Sun, Oct 10, 2021 at 04:02:02PM +0200, Florian Weimer wrote:
> >> * Linus Torvalds:
> >> 
> >> > On Fri, Oct 1, 2021 at 9:26 AM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
> >> >>
> >> >> Will any conditional branch do, or is it necessary that it depends in
> >> >> some way on the data read?
> >> >
> >> > The condition needs to be dependent on the read.
> >> >
> >> > (Easy way to see it: if the read isn't related to the conditional or
> >> > write data/address, the read could just be delayed to after the
> >> > condition and the store had been done).
> >> 
> >> That entirely depends on how the hardware is specified to work.  And
> >> the hardware could recognize certain patterns as always producing the
> >> same condition codes, e.g., AND with zero.  Do such tests still count?
> >> It depends on what the specification says.
> >> 
> >> What I really dislike about this: Operators like & and < now have side
> >> effects, and it is no longer possible to reason about arithmetic
> >> expressions in isolation.
> >
> > Is there a reasonable syntax that might help with these issues?
> 
> Is this really a problem of syntax?

No, but we seem to need some way to communicate the control-dependency's
ordering intent to the compiler.  ;-)

> > Yes, I know, we for sure have conflicting constraints on "reasonable"
> > on copy on this email.  What else is new?  ;-)
> >
> > I could imagine a tag of some sort on the load and store, linking the
> > operations that needed to be ordered.  You would also want that same
> > tag on any conditional operators along the way?  Or would the presence
> > of the tags on the load and store suffice?
> 
> If the load is assigned to a local variable whose address is not taken
> and which is only assigned this once, it could be used to label the
> store.  Then the compiler checks if all paths from the load to the
> store feature a condition that depends on the local variable (where
> qualifying conditions probably depend on the architecture).  If it
> can't prove that is the case, it emits a fake no-op condition that
> triggers the hardware barrier.  This formulation has the advantage
> that it does not add side effects to operators like <.  It even
> generalizes to different barrier-implying instructions besides
> conditional branches.

So something like this?

	tagvar = READ_ONCE(a);
	if (tagvar)
		WRITE_ONCE_COND(b, 1, tagvar);

(This seems to me to be an eminently reasonable syntax.)

Or did I miss a turn in there somewhere?
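
To make the question concrete, here is a rough user-space sketch of the
snippet above.  WRITE_ONCE_COND() is hypothetical and does not exist in
the kernel, and READ_ONCE()/WRITE_ONCE() here are simplified stand-ins
that assume GCC or Clang (__typeof__).  The tag is merely evaluated and
discarded, so this shows only the shape of the syntax.  It enforces no
ordering, which is the part that would need compiler support.

```c
#include <assert.h>

/* Simplified user-space stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/*
 * HYPOTHETICAL macro, not a kernel API.  A real version would have the
 * compiler check that every path from the tagged load to this store
 * carries a condition depending on "tag", and emit a fake no-op
 * condition otherwise.  This stand-in only evaluates and drops the tag.
 */
#define WRITE_ONCE_COND(x, v, tag) \
	do { (void)(tag); WRITE_ONCE(x, v); } while (0)

static int a, b;

/* Store to b only if the value loaded from a is non-zero. */
void load_then_store(void)
{
	int tagvar = READ_ONCE(a);	/* the load that labels the store */

	if (tagvar)
		WRITE_ONCE_COND(b, 1, tagvar);
}

/* Single-threaded helper: set a, clear b, run the snippet, return b. */
int run_with(int a_val)
{
	a = a_val;
	b = 0;
	load_then_store();
	return b;
}

/* Checks functional behavior only; says nothing about ordering. */
void self_check(void)
{
	assert(run_with(0) == 0);
	assert(run_with(5) == 1);
}
```

Calling self_check() exercises both branches in a single thread.  Whether
the tag can be made to mean anything to the compiler is of course the
open question.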

> But I'm not sure if all this complexity will be a tangible improvement
> over just using that no-op condition all the time (whether implied by
> READ_ONCE, or in a separate ctrl_dep macro).

That is an excellent question.  I have no idea what the answer is.  ;-)

						Thanx, Paul