On Sun, Jun 6, 2021 at 11:22 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> To be fair, the same argument applies even without the asm code.  The
> compiler will translate

Yes, yes. But that is literally why the asm exists in the first place.
It's supposed to be the barrier that makes sure that doesn't happen.

So your point that "but this would happen without the asm" is missing
the whole point. This is exactly the thing that the asm is supposed to
avoid.

And it actually works fine when just one side has the barrier, because
then no merging can take place, because there is nothing to merge.

That's why my suggested fix for "volatile_if()" was this #define

        #define barrier_true() ({ barrier(); 1; })
        #define volatile_if(x) if ((x) && barrier_true())

because now code like

        volatile_if (READ_ONCE(a))
                WRITE_ONCE(b, 1);
        else
                WRITE_ONCE(b, 1);

would force that branch. And it's actually fine to merge the
"WRITE(b,1)", as long as the branch exists, so the above can (and does)
compile to

        LD A
        BEQ over
        "empty asm"
   over:
        ST $1,B

and the above is actually perfectly valid code and actually solves the
problem, even if it admittedly looks entirely insane. With that crazy
"conditional jump over nothing" the store to B is ordered wrt the load
from A on real machines.

And again: I do not believe we actually have this kind of code in the
kernel. I could imagine some CPU turning "conditional branch over
nothing" into a no-op internally, and losing the ordering. And that's
ok, exactly because the above kind of code that *only* does the
WRITE_ONCE() and nothing else is crazy and stupid.

So don't get hung up on the "branch over nothing", that's just for this
insane unreal example.

But I *could* see us having something where both branches do end up
writing to "B", and it might even be the first thing both branches end
up doing. Not the *only* thing they do, but "B" might be a flag for "I
am actively working on this issue", and I could see a situation where
we care that the read of "A" (which might be what specifies *what* the
issue is) would need to be ordered with regards to that "I'm working on
it" flag.

IOW, another CPU might want to know *what* somebody is working on, and
do

        /* Is somebody working on this */
        if (READ_ONCE(B)) {
                smp_rmb();
                READ_ONCE(A);   /* <- this is what they are working on */
        }

and the ordering requirement in this all is that B has to be written
after A has been read.

So while the example code is insane and pointless (and you shouldn't
read *too* much into it), conceptually the notion of that pattern of

        if (READ_ONCE(a)) {
                WRITE_ONCE(b,1);
                .. do something ..
        } else {
                WRITE_ONCE(b,1);
                .. do something else ..
        }

is not insane or entirely unrealistic - the WRITE_ONCE() might
basically be an ACK for "I have read the value of A and will act on
it".

Odd? Yes. Unusual? Yes. Do we do this now? No.

But it does worry me that we don't seem to have a good way to add that
required barrier. Adding it on one side is good, and works, but what if
somebody then does

        volatile_if (READ_ONCE(a))
                WRITE_ONCE(b, 1);
        else {
                barrier();
                WRITE_ONCE(b, 1);
        }

and now we end up with it on both sides again, and then the second
barrier basically undoes the first one..

                Linus
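
[For readers following along, here is a minimal, self-contained user-space
sketch of the pattern discussed above. It assumes GCC-style inline asm and
statement expressions; READ_ONCE()/WRITE_ONCE() are approximated with plain
volatile casts and barrier() is only the compiler barrier (not the kernel's
definitions, and not a full smp_rmb()), so it only models the compiler-level
half of the ordering argument, not the CPU-level guarantees.]

        /*
         * Sketch only: these are simplified stand-ins for the kernel
         * macros, good enough to see what the compiler does with the
         * branch over the empty asm.
         */
        #define barrier()        __asm__ __volatile__("" ::: "memory")
        #define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
        #define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

        /* the suggested fix: force a real conditional branch */
        #define barrier_true()   ({ barrier(); 1; })
        #define volatile_if(x)   if ((x) && barrier_true())

        int a, b;

        void writer(void)
        {
                /*
                 * Both arms store to b, and the stores may be merged,
                 * but the branch over the empty asm keeps the load of
                 * a ahead of the store to b in the generated code.
                 */
                volatile_if (READ_ONCE(a))
                        WRITE_ONCE(b, 1);       /* "I saw a, acting on it" */
                else
                        WRITE_ONCE(b, 1);
        }

        void reader(void)
        {
                /* Is somebody working on this? */
                if (READ_ONCE(b)) {
                        /* the kernel would use smp_rmb() here; barrier()
                         * only covers the compiler side in this sketch */
                        barrier();
                        (void)READ_ONCE(a);     /* what they are working on */
                }
        }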