* Jann Horn <jannh@xxxxxxxxxx> wrote:

> On Thu, Nov 30, 2023 at 10:53 PM Waiman Long <longman@xxxxxxxxxx> wrote:
> > On 11/30/23 15:48, Jann Horn wrote:
> > > I have seen several cases of attempts to use mutex_unlock() to release an
> > > object such that the object can then be freed by another task.
> > > My understanding is that this is not safe because mutex_unlock(), in the
> > > MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
> > > structure after having marked it as unlocked; so mutex_unlock() requires
> > > its caller to ensure that the mutex stays alive until mutex_unlock()
> > > returns.
> > >
> > > If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
> > > have to keep the mutex alive, I think; but we could have a spurious
> > > MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
> > > between the points where __mutex_unlock_slowpath() did the cmpxchg
> > > reading the flags and where it acquired the wait_lock.
> >
> > Could you clarify under what condition a concurrent task can decide to
> > free the object holding the mutex? Is it !mutex_is_locked() or after a
> > mutex_lock()/mutex_unlock sequence?
>
> I mean a mutex_lock()+mutex_unlock() sequence.
>
> > mutex_is_locked() will return true if the mutex has waiter even if it
> > is currently free.
>
> I don't understand your point, and maybe I also don't understand what
> you mean by "free". Isn't mutex_is_locked() defined such that it only
> looks at whether a mutex has an owner, and doesn't look at the waiter
> list?

Yeah, mutex_is_locked() is not a sufficient check - and mutexes have no
implicit refcount properties like spinlocks.

Once you call a mutex API, you have to guarantee the lifetime of the object
until the function returns.

I.e. entering a mutex_lock()-ed critical section cannot be used to
guarantee that all mutex_unlock() instances have stopped using the mutex.

I agree that this is a bit unintuitive, and differs from spinlocks.

I've clarified all this a bit more in the final patch (added a 'fully'
qualifier, etc.), and made the changelog more assertive - see the attached
patch.

Thanks,

	Ingo

=======================>
From: Jann Horn <jannh@xxxxxxxxxx>
Date: Thu, 30 Nov 2023 21:48:17 +0100
Subject: [PATCH] locking/mutex: Document that mutex_unlock() is non-atomic

I have seen several cases of attempts to use mutex_unlock() to release an
object such that the object can then be freed by another task.

This is not safe because mutex_unlock(), in the
MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
structure after having marked it as unlocked; so mutex_unlock() requires
its caller to ensure that the mutex stays alive until mutex_unlock()
returns.

If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
have to keep the mutex alive, but we could have a spurious
MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
between the points where __mutex_unlock_slowpath() did the cmpxchg
reading the flags and where it acquired the wait_lock.

( With spinlocks, that kind of code pattern is allowed and, from what I
  remember, used in several places in the kernel. )

Document this, such a semantic difference between mutexes and spinlocks
is fairly unintuitive.

[ mingo: Made the changelog a bit more assertive, refined the comments. ]

Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Link: https://lore.kernel.org/r/20231130204817.2031407-1-jannh@xxxxxxxxxx
---
 Documentation/locking/mutex-design.rst | 6 ++++++
 kernel/locking/mutex.c                 | 5 +++++
 2 files changed, 11 insertions(+)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 78540cd7f54b..7572339b2f12 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
     - Detects multi-task circular deadlocks and prints out all affected
       locks and tasks (and only those tasks).
 
+Releasing a mutex is not an atomic operation: Once a mutex release operation
+has begun, another context may be able to acquire the mutex before the release
+operation has fully completed. The mutex user must ensure that the mutex is not
+destroyed while a release operation is still in progress - in other words,
+callers of mutex_unlock() must ensure that the mutex stays alive until
+mutex_unlock() has returned.
 
 Interfaces
 ----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 2deeeca3e71b..cbae8c0b89ab 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -532,6 +532,11 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
  * This function must not be used in interrupt context. Unlocking
  * of a not locked mutex is not allowed.
  *
+ * The caller must ensure that the mutex stays alive until this function has
+ * returned - mutex_unlock() can NOT directly be used to release an object such
+ * that another concurrent task can free it.
+ * Mutexes are different from spinlocks & refcounts in this aspect.
+ *
  * This function is similar to (but not equivalent to) up().
  */
 void __sched mutex_unlock(struct mutex *lock)
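
For readers who want to see the interleaving from the changelog spelled out, here is a toy, single-threaded userspace model of it. This is only a sketch under stated assumptions: struct toy_mutex, TOY_FLAG_WAITERS, the 'alive' field and every toy_*() function are hypothetical names invented for illustration, and none of this is the kernel's struct mutex or its real unlock path. The model splits unlock into the two steps the changelog describes (mark the mutex unlocked, then touch it again if the waiters flag was seen) and lets a second task lock, unlock and free the object in between.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, NOT the kernel's struct mutex. */
enum { TOY_FLAG_WAITERS = 0x1 };

struct toy_mutex {
	int owner;		/* 0 means unlocked, otherwise a task id */
	unsigned int flags;	/* TOY_FLAG_WAITERS may be stale (spurious) */
	bool alive;		/* stands in for "memory not freed yet" */
};

/* Step 1 of the modelled unlock: drop ownership, remember the flags seen. */
static unsigned int toy_unlock_release(struct toy_mutex *m)
{
	unsigned int seen = m->flags;

	m->owner = 0;		/* from here on, other tasks can take the lock */
	return seen;
}

/*
 * Step 2 of the modelled unlock: if the waiters flag was seen in step 1,
 * the unlocker touches the mutex again (wait_lock and wait list in the
 * changelog's description). Returns true if that touch hits a dead object.
 */
static bool toy_unlock_finish(const struct toy_mutex *m, unsigned int seen)
{
	if (!(seen & TOY_FLAG_WAITERS))
		return false;	/* nothing further is touched in this model */
	return !m->alive;	/* a touch after the free is a use-after-free */
}

/* A second task: lock + unlock, then free the containing object. */
static bool toy_other_task_lock_unlock_free(struct toy_mutex *m, int task)
{
	if (m->owner != 0)
		return false;	/* still locked, cannot proceed */
	m->owner = task;	/* modelled mutex_lock() */
	m->owner = 0;		/* modelled mutex_unlock(), simplified to one step */
	m->alive = false;	/* modelled kfree() of the object */
	return true;
}

/*
 * Replays the interleaving: task 1 starts unlocking, task 2 locks, unlocks
 * and frees, then task 1 finishes its unlock. Returns true if task 1
 * touched the object after it was freed.
 */
bool toy_replay_race(bool stale_waiters_flag)
{
	struct toy_mutex m = {
		.owner = 1,
		.flags = stale_waiters_flag ? TOY_FLAG_WAITERS : 0,
		.alive = true,
	};
	unsigned int seen = toy_unlock_release(&m);

	toy_other_task_lock_unlock_free(&m, 2);
	return toy_unlock_finish(&m, seen);
}
```

In this model toy_replay_race(true) reports a touch after the free and toy_replay_race(false) does not. That second result should not be read as "safe without waiters": a caller cannot know whether a stale flag is present, which is why the patch asks callers to keep the mutex alive until mutex_unlock() has returned, typically by holding a separate reference on the containing object.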