On 23-11-16 at 14:11, Daniel Vetter wrote:
> On Wed, Nov 23, 2016 at 02:08:48PM +0100, Daniel Vetter wrote:
>> On Wed, Nov 23, 2016 at 02:00:46PM +0100, Peter Zijlstra wrote:
>>> On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
>>>> From: Nicolai Hähnle <Nicolai.Haehnle@xxxxxxx>
>>>>
>>>> Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
>>>> the following example. Acquire context stamps are ordered like the thread
>>>> numbers, i.e. thread #1 should back off when it encounters a mutex locked
>>>> by thread #0 etc.
>>>>
>>>> Thread #0    Thread #1    Thread #2    Thread #3
>>>> ---------    ---------    ---------    ---------
>>>>                                        lock(ww)
>>>>                                        success
>>>>              lock(ww')
>>>>              success
>>>>                           lock(ww)
>>>>              lock(ww)        .
>>>>                 .            .         unlock(ww) part 1
>>>> lock(ww)        .            .            .
>>>> success         .            .            .
>>>>                 .            .         unlock(ww) part 2
>>>>                 .         back off
>>>> lock(ww')       .
>>>>    .            .
>>>> (stuck)      (stuck)
>>>>
>>>> Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
>>>> (without being protected by lock->base.wait_lock), meaning that thread #0
>>>> can acquire ww in the fast path or, much more likely, the medium path
>>>> in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
>>>> won't wake up any of the waiters in ww_mutex_set_context_fastpath.
>>>>
>>>> Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
>>>> thread #2, since waiters are added at the tail. Thread #2 wakes up and
>>>> backs off since it sees ww owned by a context with a lower stamp.
>>>>
>>>> Meanwhile, thread #1 is never woken up, and so it won't back off its lock
>>>> on ww'. So thread #0 gets stuck waiting for ww' to be released.
>>>>
>>>> This patch fixes the deadlock by waking up all waiters in the slow path
>>>> of ww_mutex_unlock.
>>>>
>>>> We have an internal test case for amdgpu which continuously submits
>>>> command streams from tens of threads, where all command streams reference
>>>> hundreds of GPU buffer objects with a lot of overlap in the buffer lists
>>>> between command streams. This test reliably caused a deadlock, and while I
>>>> haven't completely confirmed that it is exactly the scenario outlined
>>>> above, this patch does fix the test case.
>>>>
>>>> v2:
>>>> - use wake_q_add
>>>> - add additional explanations
>>>>
>>>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>>>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>>>> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>
>>>> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>> Reviewed-by: Christian König <christian.koenig@xxxxxxx> (v1)
>>>> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@xxxxxxx>
>>> Completely and utterly fails to apply; I think this patch is based on
>>> code prior to the mutex rewrite.
>>>
>>> Please rebase on tip/locking/core.
>>>
>>> Also, is this a regression, or has this been a 'feature' of the ww_mutex
>>> code from early on?
>> Sorry, forgot to mention that, but I checked. It seems to have been broken
>> since day 1; at least looking at the original code, the wake-single-waiter
>> stuff is as old as the mutex code added in 2006.
> More details: for GPU drivers this was originally working, since the
> ww_mutex implementation in ttm did use wake_up_all. So we need to add a
>
> Fixes: 5e338405119a ("drm/ttm: convert to the reservation api")

But it's also broken in the original kernel ww_mutex implementation; I guess
it doesn't matter much, since ttm was the first in-kernel user anyway. :)
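For readers on stable who want to picture what the fix does against the
pre-rewrite mutex code, the rough shape is below. To be clear, this is only a
sketch of the idea, not Nicolai's actual patch: the function name is made up,
the helpers (spin_lock_mutex, the WAKE_Q initializer, struct mutex_waiter) are
the pre-rewrite ones, and the debug/lockdep bookkeeping of the real slow path
is left out.

/*
 * Hypothetical sketch, meant to sit next to the old
 * __mutex_unlock_common_slowpath() in kernel/locking/mutex.c: a
 * ww_mutex-only unlock slow path that queues every waiter on a wake_q
 * instead of waking only the first one.
 */
static noinline void __ww_mutex_unlock_slowpath_wakeall(struct mutex *lock)
{
	struct mutex_waiter *waiter;
	unsigned long flags;
	WAKE_Q(wake_q);		/* DEFINE_WAKE_Q(wake_q) on newer trees */

	/* "Part 1" of unlock: let the fast path see the lock as free. */
	atomic_set(&lock->count, 1);

	spin_lock_mutex(&lock->wait_lock, flags);

	/*
	 * "Part 2" of unlock: instead of waking only the head of wait_list
	 * (which may be a waiter that immediately backs off), queue all
	 * waiters, so a backing-off waiter cannot swallow the only wakeup.
	 */
	list_for_each_entry(waiter, &lock->wait_list, list)
		wake_q_add(&wake_q, waiter->task);

	spin_unlock_mutex(&lock->wait_lock, flags);

	/* Issue the wakeups only after wait_lock has been dropped. */
	wake_up_q(&wake_q);
}

The point is the list_for_each_entry() + wake_q_add() loop: every waiter is
queued while wait_lock is held and wake_up_q() fires after the lock is
dropped, so a waiter that backs off (thread #2 above) no longer consumes the
single wakeup that thread #1 was depending on.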
~Maarten