Re: clear_bit_unlock_is_negative_byte

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Fri, 21 Jul 2023 18:03:23 +0100

On Fri, Jul 21, 2023 at 01:43:06PM +1200, Michael Schmitz wrote:
Ah, it's not supposed to be cleared.  The way this works is that bit 0
is the lock bit; if someone's waiting on the folio, they set bit 7.  If
bit 7 is set when we clear bit 0, we look on the wait queue.  If there's
nobody on the wait queue, we clear bit 7.
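
The protocol described above can be sketched as a single-threaded toy model. This is only an illustration, not the kernel's code: `struct toy_folio`, its `queued` counter (a stand-in for the real wait queue) and `toy_unlock()` are hypothetical names, and nothing here is atomic.

```c
#include <stdbool.h>

#define LOCK_BIT    (1UL << 0)  /* bit 0: the folio is locked */
#define WAITERS_BIT (1UL << 7)  /* bit 7: set by anyone who waits on the folio */

/* Hypothetical toy folio: a flags word plus a count of queued waiters. */
struct toy_folio {
        unsigned long flags;
        unsigned int queued;    /* stand-in for the wait queue */
};

/*
 * Clear the lock bit; if bit 7 was set, look at the "wait queue".
 * Returns true when the queue had to be inspected.
 */
static bool toy_unlock(struct toy_folio *f)
{
        f->flags &= ~LOCK_BIT;
        if (!(f->flags & WAITERS_BIT))
                return false;           /* fast path: nobody asked to be woken */

        if (f->queued == 0)
                f->flags &= ~WAITERS_BIT;       /* nobody queued: clear bit 7 */
        else
                f->queued--;            /* simplification: "wake" one waiter */
        return true;
}
```

In this model bit 7 stays set while waiters remain queued, which is why it can legitimately be observed set after an unlock.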

Right, that's what I meant to say. I'd only seen cases where bit 0 had been
set and was cleared. This isn't an actual production system of sorts, just
an ARAnyM instance I can fire up quickly to see patched kernels crash
horribly.

Well, I appreciate the testing!

This is what I have tests running on right now:

static inline bool clear_bit_unlock_is_negative_byte(unsigned int nr,
                volatile unsigned long *p)
{
        unsigned char *cp = (unsigned char *) p;
        char result;
        char mask = 1 << nr;    /* nr guaranteed to be < 7 */

        __asm__ __volatile__ ("eori.b %1, %2; smi %0"
                : "=d" (result)
                : "i" (mask), "o" (*(cp+3))
                : "memory");
        return result;
}

I thought it a little odd to use an unsigned char when we're testing
to see if it's negative, so I went with this:

static inline bool clear_bit_unlock_is_negative_byte(unsigned int nr,
                volatile unsigned long *p)
{
        char result;
        char mask = 1 << nr;            /* nr guaranteed to be < 7 */
        char *cp = (char *)p + 3;       /* m68k is big-endian */

        __asm__ __volatile__ ("eori.b %1, %2; smi %0"
                : "=d" (result)
                : "i" (mask), "o" (*cp)
                : "memory");
        return result;
}
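
As a portable illustration of why the byte pointer is advanced by 3 and why a signed char suits the test, here is a minimal sketch. The helper names `low_byte_offset()` and `low_byte_is_negative()` are made up for this example. It assumes a host that is either plainly little-endian or plainly big-endian; on m68k, where unsigned long is 32 bits and big-endian, the offset works out to 3.

```c
#include <stdbool.h>
#include <stddef.h>

/* Offset of the least significant byte (bits 0-7) within an unsigned long. */
static size_t low_byte_offset(void)
{
        const unsigned long probe = 1;
        const unsigned char *bytes = (const unsigned char *)&probe;

        /* Little-endian keeps the low byte first, big-endian keeps it last. */
        return bytes[0] == 1 ? 0 : sizeof(unsigned long) - 1;
}

/* Bit 7 of the low byte is the sign bit when that byte is read as signed. */
static bool low_byte_is_negative(const unsigned long *p)
{
        const signed char *cp = (const signed char *)p + low_byte_offset();

        return *cp < 0;
}
```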

I'm sure you can do all the casting to char and increment by 3 in the asm
argument...

I'd rather not.  I looked at doing the offset by three inside the asm,
but it seems like gcc is smart enough to do that without help:

000006e0 <folio_unlock>:
     6e0:       206f 0004       moveal %sp@(4),%a0
     6e4:       0a28 0001 0003  eorib #1,%a0@(3)
     6ea:       5bc0            smi %d0
     6ec:       4a00            tstb %d0
     6ee:       670a            beqs 6fa <folio_unlock+0x1a>
     6f0:       42a7            clrl %sp@-
     6f2:       2f08            movel %a0,%sp@-
     6f4:       4eba fcec       jsr %pc@(3e2 <folio_wake_bit>)
     6f8:       508f            addql #8,%sp
     6fa:       4e75            rts

You'll note the smi/tstb pair are unnecessary.  It could simply BPL to
the RTS instruction, but we can't tell GCC that because we don't have
the __GCC_ASM_FLAG_OUTPUTS__ feature.

By the way, before this optimisation, it was this:

000006fc <folio_unlock>:
     6fc:       206f 0004       moveal %sp@(4),%a0
     700:       08a8 0000 0003  bclr #0,%a0@(3)
     706:       2010            movel %a0@,%d0
     708:       4a00            tstb %d0
     70a:       6c0a            bges 716 <folio_unlock+0x1a>
     70c:       42a7            clrl %sp@-
     70e:       2f08            movel %a0,%sp@-
     710:       4eba fcd0       jsr %pc@(3e2 <folio_wake_bit>)
     714:       508f            addql #8,%sp
     716:       4e75            rts

which is the same number of instructions, but one more memory reference.
It's a read-after-write hazard, but I don't know if that affects any
m68k implementation; my impression is that even on an '060 there aren't
any real performance implications.  Kudos to gcc for figuring out that
testing bit 7 can be done with the tstb instruction.
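
In C terms, the difference between the two listings is roughly the following. This is a sketch using C11 atomics rather than the kernel's bitops, and both function names are hypothetical. The XOR variant relies on the lock bit being set on entry, as it is when unlocking; otherwise it would set the bit instead of clearing it.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOCK_BIT    (1UL << 0)
#define WAITERS_BIT (1UL << 7)

/* Old shape: clear the lock bit, then read the word again to test bit 7. */
static bool unlock_two_accesses(atomic_ulong *flags)
{
        atomic_fetch_and_explicit(flags, ~LOCK_BIT, memory_order_release);
        return atomic_load_explicit(flags, memory_order_relaxed) & WAITERS_BIT;
}

/* New shape: one read-modify-write whose result already carries bit 7. */
static bool unlock_one_access(atomic_ulong *flags)
{
        unsigned long old = atomic_fetch_xor_explicit(flags, LOCK_BIT,
                                                      memory_order_release);

        /* XOR with the lock bit leaves bit 7 untouched, so old is enough. */
        return old & WAITERS_BIT;
}
```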

If there's a simple way to exercise this code path using standard Unix tools
(or stress-ng which I ought to have somewhere), drop me a hint.

Oh, it's so common to have a waiter on a folio unlock that just making
it to the login prompt is enough to declare confidently that this works.
CPU implementations with memory barriers and such fanciness are a little
harder to be confident in, but this looks good to me.  I generally run
xfstests, but that's just because I have it all set up and ready to go.

I'll drop your Tested-by on this if that's OK?  If you want a
Co-developed-by credit, that's fine with me too!