Re: clear_bit_unlock_is_negative_byte

Michael Schmitz <schmitzmic@xxxxxxxxx> · Fri, 21 Jul 2023 10:37:14 +1200

Hi Matthew,

Am 21.07.2023 um 07:27 schrieb Matthew Wilcox:
I'm looking to implement clear_bit_unlock_is_negative_byte() on every
architecture so we can delete the ifdeffery around maybe-we-have-it
and remove the simple implementation from filemap.c.  Here's what I've
come up with for m68k:

+static inline bool clear_bit_unlock_is_negative_byte(unsigned int nr,
+               volatile unsigned long *p)
+{
+       char result;
+       char mask = 1 << nr;    /* nr guaranteed to be < 7 */
+
+       __asm__ __volatile__ ("eori %1, %2; smi %0"
+               : "=d" (result)
+               : "i" (mask), "o" (*p)
+               : "memory");
+       return result;
+}

It compiles, so I feel Very Pleased With Myself, since I haven't written
m68k assmbly in 25 years.  But I have questions.

Unfortunately, after plugging this in, the boot hangs at the first block 
device encountered:

calling  proc_hardware_init+0x0/0x20 @ 1
initcall proc_hardware_init+0x0/0x20 returned 0 after 40 usecs
calling  rtc_init+0x0/0x5e @ 1
initcall rtc_init+0x0/0x5e returned 0 after 40 usecs
calling  atari_nvram_init+0x0/0x42 @ 1
initcall atari_nvram_init+0x0/0x42 returned 0 after 40 usecs
calling  nfhd_init+0x0/0x206 @ 1
nfhd8: found device with 20971440 blocks (512 bytes)

Doesn't eat up memory, doesn't sit in a tight spin, just never proceeds 
past that point.

Running old and new code side by side (well, actually sequentially) in 
clear_bit_unlock_is_negative_byte()  and comparing results, the return 
values generated by both options match but the actual contents of the 
pointer passed to the function (after the mods) does differ:

nfhd8: found device with 20971440 blocks (512 bytes)
cbu_inb memval mismatch: 2004 12005
cbu_inb memval mismatch: 2004 12005
 nfhd8: AHDI p1 p2
cbu_inb memval mismatch: 26 10027
cbu_inb memval mismatch: 26 10027

(first value is from the old code, second value from the new code).

It looks to me that the eori operates on a 16 bit word here?

(Disclaimer: I have written m68k assembly occasionally in the past 
years, but I struggle with inline asm...)

Cheers,

	Michael

First, m68k is big-endian, so I suspect I'm accessing the wrong byte.
Should something in there be adding 3 to 'p'?  Better to do it in the
asm, or in the constraints so the compiler can see it?

Second, have I properly communicated to the assembler that this is
a byte-size operation, and it needs to check bit 7 and not bits 15 or 31
to set the negative flag?

Third, can this be done better?  x86 has __GCC_ASM_FLAG_OUTPUTS__
so it doesn't need the equivalent of the SMI instruction to move the
condition to an output variable; it can just tell the compiler that
the N flag communicates the result that it's looking for.  Does m68k
have __GCC_ASM_FLAG_OUTPUTS__ or did nobody do that work yet?

Fourth, we could do this is with ANDI instead of EORI.  It's mildly
safer, but we really shouldn't have two threads clearing the lock bit
that race with each other.  We can't do it with BCLR because that
doesn't set the N flag.  If we do that, we'd need to invert the mask.

Appreciate your time looking at this.