On Wed, 12 Jun 2024 at 17:12, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>
> While I did not try to figure out who transiently took the lock (it was
> something outside of the benchmark), I devised a trivial reproducer
> which triggers the problem almost every time: merely issue "ls" of the
> directory containing the tested file (in this case: "ls /tmp").

So I have no problem with your patch 2/2 - moving the lockref data
structure away from everything else that can be shared read-only makes
a ton of sense independently of anything else.

Except you also randomly increased a retry count in there, which makes
no sense.

But this patch 1/2 makes me go "Eww, hacky hacky".

We already *have* the retry loop, it's just that currently it only
covers the cmpxchg failures. The natural thing to do is to just make
the "wait for unlocked" be part of the same loop.

In fact, I have this memory of trying this originally, and it not
mattering and just making the code uglier, but that may be me confusing
myself. It's a *loong* time ago.

With the attached patch, lockref_get() (to pick one random case) ends
up looking like this:

        mov    (%rdi),%rax
        mov    $0x64,%ecx
  loop:
        test   %eax,%eax
        jne    locked
        mov    %rax,%rdx
        sar    $0x20,%rdx
        add    $0x1,%edx
        shl    $0x20,%rdx
        lock cmpxchg %rdx,(%rdi)
        jne    fail
        // SUCCESS
        ret
  locked:
        pause
        mov    (%rdi),%rax
  fail:
        sub    $0x1,%ecx
        jne    loop

(with the rest being the "take lock and go slow" case).

It seems much better to me to have *one* retry loop that handles both
causes of failure.

Entirely untested, I only looked at the generated code and it looked
reasonable. The patch may be entirely broken for some random reason I
didn't think of.

And in case you wonder, that 'lockref_locked()' macro I introduce is
purely to make the code more readable. Without it, that one conditional
line ends up being insanely long; the macro is there just to break
things up into slightly more manageable chunks.

Mind testing this approach instead?

               Linus
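To make the control flow easier to follow than in raw assembly, here is
a rough userspace C sketch of the same combined loop. It is purely
illustrative: lockref_get_fast() is a made-up name, the lockref is
modeled as a single 64-bit word (lock in the low half, count in the
high half), and C11 atomics stand in for the kernel's
try_cmpxchg64_relaxed() and cpu_relax().

  #include <stdatomic.h>
  #include <stdbool.h>
  #include <stdint.h>

  /* Returns false when the caller must fall back to the spinlock. */
  static bool lockref_get_fast(_Atomic uint64_t *lockref)
  {
          uint64_t old = atomic_load_explicit(lockref, memory_order_relaxed);
          int retry = 100;        /* the $0x64 loaded into %ecx above */

          do {
                  if ((uint32_t)old != 0) {
                          /* the "locked:" path: reload and go around
                           * (the kernel would cpu_relax() here) */
                          old = atomic_load_explicit(lockref,
                                                     memory_order_relaxed);
                          continue;
                  }
                  uint64_t new = old + (1ULL << 32);      /* count++ */
                  /* like try_cmpxchg, this reloads 'old' on failure */
                  if (atomic_compare_exchange_weak_explicit(lockref,
                                  &old, new,
                                  memory_order_relaxed,
                                  memory_order_relaxed))
                          return true;    /* SUCCESS */
          } while (--retry);

          return false;
  }

Note that the "continue" on the locked path still falls through to the
"while (--retry)" test, so a contended lock burns down the same retry
budget as cmpxchg failures do.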
 lib/lockref.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/lib/lockref.c b/lib/lockref.c
index 2afe4c5d8919..70f38621901b 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -4,6 +4,9 @@
 
 #if USE_CMPXCHG_LOCKREF
 
+#define lockref_locked(l) \
+        unlikely(!arch_spin_value_unlocked((l).lock.rlock.raw_lock))
+
 /*
  * Note that the "cmpxchg()" reloads the "old" value for the
  * failure case.
@@ -13,7 +16,12 @@
         struct lockref old; \
         BUILD_BUG_ON(sizeof(old) != 8); \
         old.lock_count = READ_ONCE(lockref->lock_count); \
-        while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) { \
+        do { \
+                if (lockref_locked(old)) { \
+                        cpu_relax(); \
+                        old.lock_count = READ_ONCE(lockref->lock_count); \
+                        continue; \
+                } \
                 struct lockref new = old; \
                 CODE \
                 if (likely(try_cmpxchg64_relaxed(&lockref->lock_count, \
@@ -21,9 +29,7 @@
                         new.lock_count))) { \
                         SUCCESS; \
                 } \
-                if (!--retry) \
-                        break; \
-        } \
+        } while (--retry); \
 } while (0)
 
 #else
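For context on how the macro body above gets instantiated, the callers
are untouched by this patch; lockref_get() in lib/lockref.c invokes it
along these lines, with CODE bumping the count and SUCCESS returning
early (that return is the "// SUCCESS" / "ret" pair in the assembly):

  void lockref_get(struct lockref *lockref)
  {
          CMPXCHG_LOOP(
                  new.count++;
          ,
                  return;
          );

          /* slow path: the cmpxchg loop gave up */
          spin_lock(&lockref->lock);
          lockref->count++;
          spin_unlock(&lockref->lock);
  }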