On Tue, Jan 12, 2016 at 05:11:00PM +0100, Daniel Vetter wrote: > On Tue, Jan 12, 2016 at 05:59:46PM +0200, Imre Deak wrote: > > On ti, 2016-01-12 at 16:35 +0100, Daniel Vetter wrote: > > > On Tue, Jan 12, 2016 at 02:21:51PM +0000, John Harrison wrote: > > > > On 12/01/2016 14:04, Daniel Vetter wrote: > > > > > On Tue, Jan 12, 2016 at 11:50:34AM +0000, John Harrison wrote: > > > > > > On 12/01/2016 11:28, Chris Wilson wrote: > > > > > > > On Tue, Jan 12, 2016 at 11:11:20AM +0000, John Harrison > > > > > > > wrote: > > > > > > > > On 12/01/2016 00:20, Chris Wilson wrote: > > > > > > > > > On Mon, Jan 11, 2016 at 06:42:31PM +0000, John.C.Harrison > > > > > > > > > @Intel.com wrote: > > > > > > > > > > From: John Harrison <John.C.Harrison@xxxxxxxxx> > > > > > > > > > > > > > > > > > > > > A later patch in this series re-organises the batch > > > > > > > > > > buffer submission > > > > > > > > > > code. Part of that is to reduce the scope of a > > > > > > > > > > pm_get/put pair. > > > > > > > > > > Specifically, they previously wrapped the entire > > > > > > > > > > submission path from > > > > > > > > > > the very start to the very end, now they only wrap the > > > > > > > > > > actual hardware > > > > > > > > > > submission part in the back half. > > > > > > > > > However, as you haven't fixed the ordering issue that > > > > > > > > > requires rpm_get > > > > > > > > > before struct_mutex, this is broken. > > > > > > > > Why does 'intel_runtime_pm_get' require the struct mutex to > > > > > > > > be held? > > > > > > > > It has certainly not complained at me about trying to do > > > > > > > > stuff > > > > > > > > without it. > > > > > > > Because it depends upon the struct_mutex and rpm doesn't have > > > > > > > sufficient > > > > > > > lockdep integration to be able to warn about using rpm from > > > > > > > the > > > > > > > incorrect contexts. > > > > > > Where? What does the 'pm_runtime_get_sync' call turn into? > > > > > > There are already > > > > > > other places in the driver which call intel_runtime_pm_get() > > > > > > immediately > > > > > > after grabbing the mutex lock. Also, the description comment > > > > > > for _pm_get() > > > > > > does not mention anything about mutexes at all. > > > > > If you nest rpm_get within dev->struct_mutex that's a bug and > > > > > could > > > > > deadlock. Where does this happen? And for any such place we need > > > > > a new > > > > > subtest in pm_rpm. > > > > > > > > The first two hits when grepping the driver are in > > > > 'i915_gem_seqno_info()' > > > > and 'i915_interrupt_info()' in i915_debugfs.c. Both say: > > > > ret = mutex_lock_interruptible(&dev->struct_mutex); > > > > if (ret) > > > > return ret; > > > > intel_runtime_pm_get(dev_priv); > > > > > > Yeah that's totally bonkers and will deadlock if the device is > > > actually > > > suspend. > > > > It won't deadlock, runtime resume doesn't need struct mutex. Runtime > > suspend needs it, but we return -EAGAIN from there if it's already > > held. That in turn will delay the suspend. > > You're right, why do I always mix this up. But this just got me thinking > about possible lockdep annotations. We could place a > might_lock(rpm_lockdep_key) within rpm_get. And fake-acquire that around > the actual ->resume callback. We could even do this in the rpm core ... Or we could make progress on eliminating the need for struct_mutex in rpm resume/suspend/whenevner :) -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx