Re: [PATCH] locking: Document that mutex_unlock() is non-atomic

Waiman Long <longman@xxxxxxxxxx> · Thu, 30 Nov 2023 19:33:46 -0500

On 11/30/23 15:48, Jann Horn wrote:
I have seen several cases of attempts to use mutex_unlock() to release an
object such that the object can then be freed by another task.
My understanding is that this is not safe because mutex_unlock(), in the
MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
structure after having marked it as unlocked; so mutex_unlock() requires
its caller to ensure that the mutex stays alive until mutex_unlock()
returns.

If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
have to keep the mutex alive, I think; but we could have a spurious
MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
between the points where __mutex_unlock_slowpath() did the cmpxchg
reading the flags and where it acquired the wait_lock.

(With spinlocks, that kind of code pattern is allowed and, from what I
remember, used in several places in the kernel.)

If my understanding of this is correct, we should probably document this -
I think such a semantic difference between mutexes and spinlocks is fairly
unintuitive.
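
For illustration, here is a minimal userspace sketch of a lifetime rule that is compatible with the constraint described above. It is not kernel code: pthread mutexes stand in for struct mutex (and POSIX has its own rules about destroying a mutex after unlock), and every name in it (struct obj, obj_put(), signaller(), reaper(), run_demo()) is hypothetical. The point is only the ownership discipline: each task holds its own reference and drops it after its unlock call has returned, so the final free cannot overlap with an unlock still in progress.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object; all names in this sketch are illustrative. */
struct obj {
	pthread_mutex_t lock;	/* stand-in for a kernel struct mutex */
	atomic_int refs;	/* one reference per task using the object */
	bool done;		/* protected by lock */
};

static atomic_int objs_freed;	/* counts frees, for the demo only */

static struct obj *obj_new(int refs)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	pthread_mutex_init(&o->lock, NULL);
	atomic_init(&o->refs, refs);
	o->done = false;
	return o;
}

/* Drop one reference; only the last holder destroys and frees. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refs, 1) == 1) {
		pthread_mutex_destroy(&o->lock);
		free(o);
		atomic_fetch_add(&objs_freed, 1);
	}
}

static void *signaller(void *arg)
{
	struct obj *o = arg;

	pthread_mutex_lock(&o->lock);
	o->done = true;
	pthread_mutex_unlock(&o->lock);
	/* The unlock has returned, so giving up our reference is safe. */
	obj_put(o);
	return NULL;
}

static void *reaper(void *arg)
{
	struct obj *o = arg;
	bool done = false;

	while (!done) {
		pthread_mutex_lock(&o->lock);
		done = o->done;
		pthread_mutex_unlock(&o->lock);
		if (!done)
			sched_yield();
	}
	/*
	 * The problematic pattern would free the object right here, on
	 * the strength of having seen done == true under the lock, while
	 * the signaller may still be inside its unlock call. Dropping a
	 * reference instead defers the free until both tasks are finished.
	 */
	obj_put(o);
	return NULL;
}

/* Returns the number of objects freed (1 on success), or -1 on error. */
static int run_demo(void)
{
	struct obj *o = obj_new(2);
	pthread_t a, b;

	if (!o)
		return -1;
	atomic_store(&objs_freed, 0);
	if (pthread_create(&a, NULL, signaller, o)) {
		pthread_mutex_destroy(&o->lock);
		free(o);
		return -1;
	}
	if (pthread_create(&b, NULL, reaper, o)) {
		pthread_join(a, NULL);
		obj_put(o);	/* drop the reference meant for the reaper */
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return atomic_load(&objs_freed);
}
```

Calling run_demo() from a small main() and building with -pthread should free the object exactly once, whichever thread finishes last. In kernel code the analogous tools would presumably be a refcount or RCU-deferred freeing rather than this hand-rolled counter; which one fits depends on the subsystem.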

Spinlocks are fair, so doing a lock/unlock sequence will make sure that
all the previously waiting waiters are done with the lock. Para-virtual
spinlocks, however, can be a bit unfair, so doing a lock/unlock sequence
may not be enough to guarantee that there is no waiter. The same is true
for mutexes. Adding a spin_is_locked() or mutex_is_locked() check can make
sure that all the waiters are gone.

Also, the term "non-atomic" is somewhat ambiguous; it is not clear exactly
what it is meant to convey here.

Cheers,
Longman


Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
---
I hope for some thorough review on this patch to make sure the comments
I'm adding are actually true, and to confirm that mutexes intentionally
do not support this usage pattern.

  Documentation/locking/mutex-design.rst | 6 ++++++
  kernel/locking/mutex.c                 | 5 +++++
  2 files changed, 11 insertions(+)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 78540cd7f54b..087716bfa7b2 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
      - Detects multi-task circular deadlocks and prints out all affected
        locks and tasks (and only those tasks).
  
+Releasing a mutex is not an atomic operation: Once a mutex release operation
+has begun, another context may be able to acquire the mutex before the release
+operation has completed. The mutex user must ensure that the mutex is not
+destroyed while a release operation is still in progress - in other words,
+callers of 'mutex_unlock' must ensure that the mutex stays alive until
+'mutex_unlock' has returned.
  
  Interfaces
  ----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 2deeeca3e71b..4c6b83bab643 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -532,6 +532,11 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
   * This function must not be used in interrupt context. Unlocking
   * of a not locked mutex is not allowed.
   *
+ * The caller must ensure that the mutex stays alive until this function has
+ * returned - mutex_unlock() can NOT directly be used to release an object such
+ * that another concurrent task can free it.
+ * Mutexes are different from spinlocks in this aspect.
+ *
   * This function is similar to (but not equivalent to) up().
   */
  void __sched mutex_unlock(struct mutex *lock)

base-commit: 3b47bc037bd44f142ac09848e8d3ecccc726be99