On Wed, 22 Nov 2023 at 09:20, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> If you want to do the optimization that the compiler misses by hand,
> it would be something like the attached patch.

Bah. Might as well do the reference decrements with the same logic, not
just the increments.

Of course, this is much more noticeable with the ticket locks, because
with the qspinlocks the "is this unlocked" test will check whether the
lock is all zeroes.

So with qspinlocks, the compiler sees that "oh, the low 32 bits are
zero", and the whole "merge the two words back to 64 bits" is much
cheaper, and doesn't generate quite the mess that it does for RISC-V
with ticket locks.

But this "treat the lockref as a 64-bit entity" thing is probably a
good thing on most 64-bit architectures, including x86 that has that
qspinlock thing.

Still not actually tested, but the code generation on x86 looks
reasonable, so it might be worth looking at whether it helps the
RISC-V case.

                Linus
diff --git a/lib/lockref.c b/lib/lockref.c
index 2afe4c5d8919..56f4419f593d 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -26,6 +26,17 @@
 	}
 } while (0)
 
+/*
+ * The compiler isn't smart enough to do the count
+ * increment in the high 32 bits of the 64-bit value,
+ * so do this optimization by hand.
+ */
+#if defined(__LITTLE_ENDIAN) && BITS_PER_LONG == 64
+ #define LOCKREF_ADD(n,x) ((n).lock_count += (unsigned long)(x)<<32)
+#else
+ #define LOCKREF_ADD(n,x) ((n).count += (x))
+#endif
+
 #else
 
 #define CMPXCHG_LOOP(CODE, SUCCESS) do { } while (0)
@@ -42,7 +53,7 @@
 void lockref_get(struct lockref *lockref)
 {
 	CMPXCHG_LOOP(
-		new.count++;
+		LOCKREF_ADD(new,1);
 	,
 		return;
 	);
@@ -63,7 +74,7 @@ int lockref_get_not_zero(struct lockref *lockref)
 	int retval;
 
 	CMPXCHG_LOOP(
-		new.count++;
+		LOCKREF_ADD(new,1);
 		if (old.count <= 0)
 			return 0;
 	,
@@ -91,7 +102,7 @@ int lockref_put_not_zero(struct lockref *lockref)
 	int retval;
 
 	CMPXCHG_LOOP(
-		new.count--;
+		LOCKREF_ADD(new,-1);
 		if (old.count <= 1)
 			return 0;
 	,
@@ -119,7 +130,7 @@ EXPORT_SYMBOL(lockref_put_not_zero);
 int lockref_put_return(struct lockref *lockref)
 {
 	CMPXCHG_LOOP(
-		new.count--;
+		LOCKREF_ADD(new,-1);
 		if (old.count <= 0)
 			return -1;
 	,
@@ -137,7 +148,7 @@ EXPORT_SYMBOL(lockref_put_return);
 int lockref_put_or_lock(struct lockref *lockref)
 {
 	CMPXCHG_LOOP(
-		new.count--;
+		LOCKREF_ADD(new,-1);
 		if (old.count <= 1)
 			break;
 	,
@@ -174,7 +185,7 @@ int lockref_get_not_dead(struct lockref *lockref)
 	int retval;
 
 	CMPXCHG_LOOP(
-		new.count++;
+		LOCKREF_ADD(new,1);
 		if (old.count < 0)
 			return 0;
 	,