This is a note to let you know that I've just added the patch titled futex: Fix more put_pi_state() vs. exit_pi_state_list() races to the 4.13-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: futex-fix-more-put_pi_state-vs.-exit_pi_state_list-races.patch and it can be found in the queue-4.13 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 153fbd1226fb30b8630802aa5047b8af5ef53c9f Mon Sep 17 00:00:00 2001 From: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Date: Tue, 31 Oct 2017 11:18:53 +0100 Subject: futex: Fix more put_pi_state() vs. exit_pi_state_list() races From: Peter Zijlstra <peterz@xxxxxxxxxxxxx> commit 153fbd1226fb30b8630802aa5047b8af5ef53c9f upstream. Dmitry (through syzbot) reported being able to trigger the WARN in get_pi_state() and a use-after-free on: raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock); Both are due to this race: exit_pi_state_list() put_pi_state() lock(&curr->pi_lock) while() { pi_state = list_first_entry(head); hb = hash_futex(&pi_state->key); unlock(&curr->pi_lock); dec_and_test(&pi_state->refcount); lock(&hb->lock) lock(&pi_state->pi_mutex.wait_lock) // uaf if pi_state free'd lock(&curr->pi_lock); .... unlock(&curr->pi_lock); get_pi_state(); // WARN; refcount==0 The problem is we take the reference count too late, and don't allow it being 0. Fix it by using inc_not_zero() and simply retrying the loop when we fail to get a refcount. In that case put_pi_state() should remove the entry from the list. Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: Gratian Crisan <gratian.crisan@xxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Cc: dvhart@xxxxxxxxxxxxx Cc: syzbot <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@xxxxxxxxxxxxxxxxxxxxxxxxx> Cc: syzkaller-bugs@xxxxxxxxxxxxxxxx Fixes: c74aef2d06a9 ("futex: Fix pi_state->owner serialization") Link: http://lkml.kernel.org/r/20171031101853.xpfh72y643kdfhjs@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- kernel/futex.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) --- a/kernel/futex.c +++ b/kernel/futex.c @@ -901,11 +901,27 @@ void exit_pi_state_list(struct task_stru */ raw_spin_lock_irq(&curr->pi_lock); while (!list_empty(head)) { - next = head->next; pi_state = list_entry(next, struct futex_pi_state, list); key = pi_state->key; hb = hash_futex(&key); + + /* + * We can race against put_pi_state() removing itself from the + * list (a waiter going away). put_pi_state() will first + * decrement the reference count and then modify the list, so + * its possible to see the list entry but fail this reference + * acquire. + * + * In that case; drop the locks to let put_pi_state() make + * progress and retry the loop. + */ + if (!atomic_inc_not_zero(&pi_state->refcount)) { + raw_spin_unlock_irq(&curr->pi_lock); + cpu_relax(); + raw_spin_lock_irq(&curr->pi_lock); + continue; + } raw_spin_unlock_irq(&curr->pi_lock); spin_lock(&hb->lock); @@ -916,8 +932,10 @@ void exit_pi_state_list(struct task_stru * task still owns the PI-state: */ if (head->next != next) { + /* retain curr->pi_lock for the loop invariant */ raw_spin_unlock(&pi_state->pi_mutex.wait_lock); spin_unlock(&hb->lock); + put_pi_state(pi_state); continue; } @@ -925,9 +943,8 @@ void exit_pi_state_list(struct task_stru WARN_ON(list_empty(&pi_state->list)); list_del_init(&pi_state->list); pi_state->owner = NULL; - raw_spin_unlock(&curr->pi_lock); - get_pi_state(pi_state); + raw_spin_unlock(&curr->pi_lock); raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock); spin_unlock(&hb->lock); Patches currently in stable-queue which might be from peterz@xxxxxxxxxxxxx are queue-4.13/futex-fix-more-put_pi_state-vs.-exit_pi_state_list-races.patch queue-4.13/perf-cgroup-fix-perf-cgroup-hierarchy-support.patch