Peter Zijlstra wrote on Mon, Jan 04, 2016 at 04:59:15PM +0100:
> On Tue, Dec 29, 2015 at 10:43:26PM -0800, Andy Lutomirski wrote:
> > [add cc's]
> >
> > Hi scheduler people:
> >
> > This is relatively easy for me to reproduce. Any hints for debugging
> > it? Could we really have a bug in which processes that are
> > schedulable as a result of mutex unlock aren't always reliably
> > scheduled?
>
> I would expect that to cause wide-spread fail, then again, virt is known
> to tickle timing issues that are improbable on actual hardware so
> anything is possible.
>
> Does it reproduce with DEBUG_MUTEXES set? (I'm not seeing a .config
> here).

The config has CONFIG_DEBUG_MUTEXES=y
It got attached a while ago, reposting it here.

> If its really easy you could start by tracing events/sched/sched_switch
> events/sched/sched_wakeup, those would be the actual scheduling events.

I'm sure I've missed something in /Documentation but I'm not aware how
to trace these?
(I'm happy to save Andy some precious time as I've got a reproducer all
set up now)

> Without DEBUG_MUTEXES there's the MUTEX_SPIN_ON_OWNER code that could
> still confuse things, but that's mutex internal and not scheduler
> related.
>
> If it ends up being the SPIN_ON_OWNER bits we'll have to cook up some
> extra debug patches.

-- 
Dominique

Attachment: bad-config.xz
Description: Binary data