On pe, 2015-11-06 at 08:54 +0000, Chris Wilson wrote: > On Fri, Nov 06, 2015 at 12:57:35AM +0200, Imre Deak wrote: > > On Thu, 2015-11-05 at 11:56 +0000, Chris Wilson wrote: > > > On Thu, Nov 05, 2015 at 01:28:32PM +0200, Imre Deak wrote: > > > > On ke, 2015-11-04 at 20:57 +0000, Chris Wilson wrote: > > > > > On Wed, Nov 04, 2015 at 09:25:32PM +0200, Imre Deak wrote: > > > > > > After Damien's D3 fix I started to get runtime suspend > > > > > > residency for the > > > > > > first time and that revealed a breakage on the set_caching > > > > > > IOCTL path > > > > > > that accesses the HW but doesn't take an RPM ref. Fix this > > > > > > up. > > > > > > > > > > Why here (and in so many random places) and not around the > > > > > PTE write > > > > > itself? > > > > > > > > Imo we should take the RPM ref outside of any of our locks. > > > > Otherwise we > > > > need hacks like we have currently in the runtime suspend > > > > handler to work > > > > around lock inversions. It works now, but we couldn't do the > > > > same trick > > > > if we needed to take struct_mutex for example in the resume > > > > handler too > > > > for some reason. > > > > > > On the other hand, it leads to hard to track down bugs and > > > improper > > > documentation. Neither solution is perfect. > > > > I think I understand the documentation part, not sure how that > > could be > > solved. Perhaps RPM-held asserts right before the point where the > > HW > > needs to be on? > > > > What do you mean by hard to track down bugs? Couldn't that also be > > tackled by asserts? > > > > > Note since intel_runtime_suspend has ample barriers of its own, > > > you can > > > avoid the struct_mutex if you have a dedicated dev_priv > > > ->mm.fault_list. > > > > > > Something along the lines of: > > > > Ok, looks interesting. But as you said we would have to then make > > callers of i915_gem_release_mmap() wake up the device, which isn't > > strictly necessary. (and imo as such goes somewhat against the > > documentation argument) > > Outside of suspend, we only revoke the CPU PTE mapping when we change > the hardware (as doing so is very expensive). So all callers should > already have a reason (and be holding) the rpm wakelock, the only > complication from my point of view is enforcing that and reviewing > that > what I said is true. Looked through it, it seems only i915_gem_set_tiling() could release the PTE's without waking up the hardware (if no need to unbind the object). Otherwise it's true that all callers hold (or should hold) already an RPM ref. To fix the set tiling case to work after your optimization we could wake up the HW unconditionally there, use a no_resume RPM ref+and RPM barrier or a separate new lock for the fault list. > From the rpm point of view, this should improve the success of > runtime suspend, and reduce wakelocks. Yes, seems like a worthy optimization, since I assume struct_mutex can be held for a long time without the need to wake up the hardware. Are you ok to first have the fix I posted and a similar one for i915_gem_set_tiling()? And then to follow-up with your plan. --Imre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx