Re: [RFC] LKMM: Add volatile_if()

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Mon, 7 Jun 2021 13:31:24 -0700

On Mon, Jun 7, 2021 at 10:45 AM Segher Boessenkool
<segher@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Jun 06, 2021 at 03:38:06PM -0700, Linus Torvalds wrote:
> >
> > Example: variable_test_bit(), which generates a "bt" instruction, does
> >
> >                      : "m" (*(unsigned long *)addr), "Ir" (nr) : "memory");
> >
> > and the memory clobber is obviously wrong: 'bt' only *reads* memory,
> > but since the whole reason we use it is that it's not just that word
> > at address 'addr', in order to make sure that any previous writes are
> > actually stable in memory, we use that "memory" clobber.
>
> You can split the "I" version from the "r" version, it does not need
> the memory clobber.  If you know the actual maximum bit offset used you
> don't need the clobber for "r" either.  Or you could even write
>   "m"(((unsigned long *)addr)[nr/32])
> That should work for all cases.

Note that the bit test thing really was just an example.

And some other cases don't actually have an address range at all,
because they affect arbitrary ranges, not - like that bit test - just
one particular range.

To pick a couple of examples of that, think of

 (a) write memory barrier. On some architectures it's an explicit
instruction, on x86 it's just a compiler barrier, since writes are
ordered on the CPU anyway, and we only need to make sure that the
compiler doesn't re-order writes around the barrier

Again, we currently use that same "barrier()" macro for that:

    #define __smp_wmb()     barrier()

but as mentioned, the barrier() thing has a "memory" clobber, and that
means that this write barrier - which is really really cheap on x86 -
also unnecessarily ends up causing pointless reloads from globals. It
obviously doesn't actually *change* memory, but it very much requires
that writes are not moved around it.

 (b) things like cache flush and/or invalidate instructions, eg

        asm volatile("wbinvd": : :"memory");

Again, this one doesn't actually *modify* memory, and honestly, this
one is not performance critical so the memory clobber is not actually
a problem, but I'm pointing it out as an example of the exact same
issue: the notion of an instruction that we don't want _writes_ to
move around, but reads can happily be moved and/or cached around it.

 (c) this whole "volatile_if()" situation: we want to make sure writes
can't move around it, but there's no reason to re-load memory values,
because it doesn't modify memory, and we only need to make sure that
any writes are delayed to after the conditional.

We long long ago (over 20 years by now) used to do things like this:

  struct __dummy { unsigned long a[100]; };
  #define ADDR (*(volatile struct __dummy *) addr)

      __asm__ __volatile__(
              "btl %2,%1\n\tsbbl %0,%0"
              :"=r" (oldbit)
              :"m" (ADDR),"ir" (nr));

for that test-bit thing. Note how the above doesn't need the memory
clobber, because for gcc that ADDR thing (access to a big struct) ends
up being a "BLKmode" read, and then gcc at least used to treat it as
an arbitrary read.

I forget just why we had to stop using that trick, I think it caused
some reload confusion for some gcc version at some point. Probably
exactly because inline asms had issues with some BLKmode thing. That
change happened before 2001, we didn't have nice changelogs with
detailed commit messages back then, so

> > Anybody have ideas or suggestions for something like that?
>
> Is it useful in general for the kernel to have separate "read" and
> "write" clobbers in asm expressions?  And for other applications?

See above. It's actually not all that uncommon that you have a "this
doesn't modify memory, but you can't move writes around it". It's
usually very much about cache handling or memory ordering operations,
and that bit test example was probably a bad example exactly because
it made it look like it's about some controlled range.

The "write memory barroer" is likely the best and simplest example,
but it's in not the only one.

            Linus