* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Thu, 11 Apr 2024 at 06:33, tip-bot2 for Uros Bizjak > <tip-bot2@xxxxxxxxxxxxx> wrote: > > > > Use try_cmpxchg_acquire(*ptr, &old, new) instead of > > cmpxchg_relaxed(*ptr, old, new) == old in trylock_clear_pending(). > > The above commit message is horribly confusing and wrong. > > I was going "that's not right", because it says "use acquire instead > of relaxed" memory ordering, and then goes on to say "No functional > change intended". > > But it turns out the *code* was always acquire, and it's only the > commit message that is wrong, presumably due to a bit too much > cut-and-paste. Yeah, the replacement is cmpxchg_acquire() => try_cmpxchg_acquire(), with no change to memory ordering. > But please fix the commit message, and use the right memory ordering > in the explanations too. Done, find below the new patch, which a hopefully better commit message. I also added your Reviewed-by tag optimistically :) Thanks, Ingo ===========================> From: Uros Bizjak <ubizjak@xxxxxxxxx> Date: Mon, 25 Mar 2024 15:09:32 +0100 Subject: [PATCH] locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending() Replace this pattern in trylock_clear_pending(): cmpxchg_acquire(*ptr, old, new) == old .. with the simpler and faster: try_cmpxchg_acquire(*ptr, &old, new) The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after the CMPXCHG. Also change the return type of the function to bool and streamline the control flow in the _Q_PENDING_BITS == 8 variant a bit. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@xxxxxxxxx> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> Reviewed-by: Waiman Long <longman@xxxxxxxxxx> Reviewed-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Link: https://lore.kernel.org/r/20240325140943.815051-1-ubizjak@xxxxxxxxx --- kernel/locking/qspinlock_paravirt.h | 31 +++++++++++++------------------ 1 file changed, 13 insertions(+), 18 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index 169950fe1aad..77ba80bd95f9 100644 --- a/kernel/locking/qspinlock_paravirt.h +++ b/kernel/locking/qspinlock_paravirt.h @@ -116,11 +116,12 @@ static __always_inline void set_pending(struct qspinlock *lock) * barrier. Therefore, an atomic cmpxchg_acquire() is used to acquire the * lock just to be sure that it will get it. */ -static __always_inline int trylock_clear_pending(struct qspinlock *lock) +static __always_inline bool trylock_clear_pending(struct qspinlock *lock) { + u16 old = _Q_PENDING_VAL; + return !READ_ONCE(lock->locked) && - (cmpxchg_acquire(&lock->locked_pending, _Q_PENDING_VAL, - _Q_LOCKED_VAL) == _Q_PENDING_VAL); + try_cmpxchg_acquire(&lock->locked_pending, &old, _Q_LOCKED_VAL); } #else /* _Q_PENDING_BITS == 8 */ static __always_inline void set_pending(struct qspinlock *lock) @@ -128,27 +129,21 @@ static __always_inline void set_pending(struct qspinlock *lock) atomic_or(_Q_PENDING_VAL, &lock->val); } -static __always_inline int trylock_clear_pending(struct qspinlock *lock) +static __always_inline bool trylock_clear_pending(struct qspinlock *lock) { - int val = atomic_read(&lock->val); - - for (;;) { - int old, new; - - if (val & _Q_LOCKED_MASK) - break; + int old, new; + old = atomic_read(&lock->val); + do { + if (old & _Q_LOCKED_MASK) + return false; /* * Try to clear pending bit & set locked bit */ - old = val; - new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL; - val = atomic_cmpxchg_acquire(&lock->val, old, new); + new = (old & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL; + } while (!atomic_try_cmpxchg_acquire (&lock->val, &old, new)); - if (val == old) - return 1; - } - return 0; + return true; } #endif /* _Q_PENDING_BITS == 8 */