On Wed, 2022-06-22 at 15:19 -0700, Matt Roper wrote:
> On Tue, Jun 21, 2022 at 10:03:04AM -0700, Souza, Jose wrote:
> > On Fri, 2022-06-17 at 12:28 -0700, Matt Roper wrote:
> > > On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> > > > Gem buffers could still be in use by display after i915_gem_suspend()
> > > > is executed so there is chances that i915_gem_flush_free_objects()
> > > > will be being executed at the same time that
> > > > intel_runtime_pm_driver_release() is executed printing warnings about
> > > > wakerefs will being held.
> > >
> > > By the same logic do we need to adjust i915_driver_remove() too?
> >
> > Nope, all display buffers are freed in i915_driver_unregister() call chain:
> >
> > i915_driver_remove()
> > 	i915_driver_unregister()
> > 		intel_display_driver_unregister()
> > 			drm_atomic_helper_shutdown()
> > 	i915_gem_suspend()
> > 		i915_gem_drain_freed_objects()
> >
> > Only FBC compressed framebuffer is freed after that but that will not
> > cause any warnings as it is allocated from stolen memory.
>
> Okay sounds good; thanks for checking.
>
> I'm still having a bit of trouble understanding your description of the
> issue in the commit message though:
>
>     "...so there is chances that i915_gem_flush_free_objects() will
>     be being executed at the same time that
>     intel_runtime_pm_driver_release()..."
>
> I'm not super familiar with the driver teardown paths, or the memory
> management cleanup details. Intuitively it makes sense that we should
> clean up memory management (GEM) only after we've torn down display so
> that all objects that were used by framebuffers are out of circulation.
> But from a cursory view, it looks like i915_gem_suspend() is mostly
> concerned with quiescing the GT and cleaning up PPGTT (which doesn't
> impact display since all of its buffers are in the GGTT).
>
> Is the problem arising from i915->mm.free_work still doing asynchronous
> work to actually release the unused objects at the same time we're
> tearing down runtime PM later? If so does swapping the order of the
> gem_suspend and display disable here actually prevent that from
> happening or does it just make the race less likely by helping some
> objects free up earlier?

When the last reference to a gem object is dropped, the object is added
to mm.free_list and mm.free_work is queued to actually free it.

i915_gem_drain_freed_objects() flushes mm.free_work.

If any other gem object has its last reference dropped after
i915_gem_suspend()/i915_gem_drain_freed_objects(), the warning in
intel_runtime_pm_driver_release() can trigger, because mm.free_work
could be running at the same time.

But by the time pci_driver.remove() is called, all file descriptors
attached to this device have probably been closed, and the functions
called after i915_gem_suspend() will not free any gem object, so I don't
believe we will see any more warnings.

>
> Matt
>
> > >
> > > Matt
> > >
> > > >
> > > > So here only calling i915_gem_suspend() and by consequence
> > > > i915_gem_drain_freed_objects() only after display is down making
> > > > sure all buffers are freed.
> > > >
> > > > Signed-off-by: José Roberto de Souza <jose.souza@xxxxxxxxx>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_driver.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> > > > index d26dcca7e654a..4227675dd1cfe 100644
> > > > --- a/drivers/gpu/drm/i915/i915_driver.c
> > > > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > > > @@ -1067,8 +1067,6 @@ void i915_driver_shutdown(struct drm_i915_private *i915)
> > > >  	intel_runtime_pm_disable(&i915->runtime_pm);
> > > >  	intel_power_domains_disable(i915);
> > > >  
> > > > -	i915_gem_suspend(i915);
> > > > -
> > > >  	if (HAS_DISPLAY(i915)) {
> > > >  		drm_kms_helper_poll_disable(&i915->drm);
> > > >  
> > > > @@ -1085,6 +1083,8 @@ void i915_driver_shutdown(struct drm_i915_private *i915)
> > > >  
> > > >  	intel_dmc_ucode_suspend(i915);
> > > >  
> > > > +	i915_gem_suspend(i915);
> > > > +
> > > >  	/*
> > > >  	 * The only requirement is to reboot with display DC states disabled,
> > > >  	 * for now leaving all display power wells in the INIT power domain
> > > > -- 
> > > > 2.36.1