On Thu, Jun 23, 2022 at 07:48:32AM -0700, Souza, Jose wrote:
> On Wed, 2022-06-22 at 15:19 -0700, Matt Roper wrote:
> > On Tue, Jun 21, 2022 at 10:03:04AM -0700, Souza, Jose wrote:
> > > On Fri, 2022-06-17 at 12:28 -0700, Matt Roper wrote:
> > > > On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> > > > > Gem buffers could still be in use by display after i915_gem_suspend()
> > > > > is executed so there is chances that i915_gem_flush_free_objects()
> > > > > will be being executed at the same time that
> > > > > intel_runtime_pm_driver_release() is executed printing warnings about
> > > > > wakerefs will being held.
> > > >
> > > > By the same logic do we need to adjust i915_driver_remove() too?
> > >
> > > Nope, all display buffers are freed in i915_driver_unregister() call chain:
> > >
> > >
> > > i915_driver_remove()
> > >     i915_driver_unregister()
> > >         intel_display_driver_unregister()
> > >             drm_atomic_helper_shutdown()
> > >     i915_gem_suspend()
> > >         i915_gem_drain_freed_objects()
> > >
> > >
> > > Only FBC compressed framebuffer is freed after that but that will not cause any warnings as it is allocated from stolen memory.
> >
> > Okay sounds good; thanks for checking.
> >
> > I'm still having a bit of trouble understanding your description of the
> > issue in the commit message though:
> >
> >     "...so there is chances that i915_gem_flush_free_objects() will
> >     be being executed at the same time that
> >     intel_runtime_pm_driver_release()..."
> >
> > I'm not super familiar with the driver teardown paths, or the memory
> > management cleanup details.  Intuitively it makes sense that we should
> > clean up memory management (GEM) only after we've torn down display so
> > that all objects that were used by framebuffers are out of circulation.
> > But from a cursory view, it looks like i915_gem_suspend() is mostly
> > concerned with quiescing the GT and cleaning up PPGTT (which doesn't
> > impact display since all of its buffers are in the GGTT).
> >
> > Is the problem arising from i915->mm.free_work still doing asynchronous
> > work to actually release the unused objects at the same time we're
> > tearing down runtime PM later?  If so does swapping the order of the
> > gem_suspend and display disable here actually prevent that from
> > happening or does it just make the race less likely by helping some
> > objects free up earlier?
>
> So when the last reference of a gem object is removed it is added to the mm.free_list list and mm.free_work is queued to actually free the object.
> i915_gem_drain_freed_objects() flushes the mm.free_work.
>
> If any other gem object has its last reference removed after i915_gem_suspend()/i915_gem_drain_freed_objects() the warning in
> intel_runtime_pm_driver_release() can happen as the mm.free_work could be running at the same time.
>
> But when pci_driver.remove() is called, probably all file descriptors attached to this device have been closed and the functions called after
> i915_gem_suspend() will not free any gem object, so I don't believe we will have any more warnings.

Okay, thanks for explaining, makes sense.  You might want to add some of
this extra explanation to the commit message too for future reference,
but either way,

Reviewed-by: Matt Roper <matthew.d.roper@xxxxxxxxx>

>
> >
> >
> > Matt
> >
> > >
> > > >
> > > >
> > > > Matt
> > > >
> > > > >
> > > > > So here only calling i915_gem_suspend() and by consequence
> > > > > i915_gem_drain_freed_objects() only after display is down making
> > > > > sure all buffers are freed.
> > > > >
> > > > > Signed-off-by: José Roberto de Souza <jose.souza@xxxxxxxxx>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_driver.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> > > > > index d26dcca7e654a..4227675dd1cfe 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_driver.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > > > > @@ -1067,8 +1067,6 @@ void i915_driver_shutdown(struct drm_i915_private *i915)
> > > > >          intel_runtime_pm_disable(&i915->runtime_pm);
> > > > >          intel_power_domains_disable(i915);
> > > > >
> > > > > -        i915_gem_suspend(i915);
> > > > > -
> > > > >          if (HAS_DISPLAY(i915)) {
> > > > >                  drm_kms_helper_poll_disable(&i915->drm);
> > > > >
> > > > > @@ -1085,6 +1083,8 @@ void i915_driver_shutdown(struct drm_i915_private *i915)
> > > > >
> > > > >          intel_dmc_ucode_suspend(i915);
> > > > >
> > > > > +        i915_gem_suspend(i915);
> > > > > +
> > > > >          /*
> > > > >           * The only requirement is to reboot with display DC states disabled,
> > > > >           * for now leaving all display power wells in the INIT power domain
> > > > > --
> > > > > 2.36.1
> > > > >
> > > >
> > >
> >
>

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
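
The mm.free_list / mm.free_work ordering discussed in this thread can be illustrated with a small userspace sketch. This is not i915 code: every struct and function name below is hypothetical, and the kernel workqueue is reduced to a flag that a later drain call services. It only aims to show that a drain done before the last reference is dropped leaves deferred work pending, while a drain done afterwards does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model, not the i915 data structures. */
struct model_obj {
	int refcount;
	struct model_obj *next_free;
};

struct model_mm {
	struct model_obj *free_list; /* objects whose last reference is gone */
	bool free_work_queued;       /* stands in for a queued work item */
};

struct model_obj *model_obj_create(void)
{
	struct model_obj *obj = calloc(1, sizeof(*obj));

	assert(obj != NULL);
	obj->refcount = 1;
	return obj;
}

/* Dropping the last reference only queues the object; it is freed later. */
void model_obj_put(struct model_mm *mm, struct model_obj *obj)
{
	if (--obj->refcount > 0)
		return;
	obj->next_free = mm->free_list;
	mm->free_list = obj;
	mm->free_work_queued = true;
}

/* Stand-in for flushing the deferred free work; returns objects freed. */
int model_drain_freed_objects(struct model_mm *mm)
{
	int freed = 0;

	while (mm->free_list) {
		struct model_obj *obj = mm->free_list;

		mm->free_list = obj->next_free;
		free(obj);
		freed++;
	}
	mm->free_work_queued = false;
	return freed;
}

/* Old order: drain first, then display teardown drops the last reference. */
bool pending_when_drained_before_display_down(void)
{
	struct model_mm mm = { 0 };
	struct model_obj *fb = model_obj_create(); /* still used by display */
	bool pending;

	model_drain_freed_objects(&mm); /* nothing to free yet */
	model_obj_put(&mm, fb);         /* display teardown drops the reference */
	pending = mm.free_work_queued;  /* state at the runtime PM release point */
	model_drain_freed_objects(&mm); /* cleanup for the model only */
	return pending;
}

/* New order: display teardown first, then drain. */
bool pending_when_drained_after_display_down(void)
{
	struct model_mm mm = { 0 };
	struct model_obj *fb = model_obj_create();

	model_obj_put(&mm, fb);
	model_drain_freed_objects(&mm);
	return mm.free_work_queued;
}
```

In the first ordering the flag is still set at the point that corresponds to releasing runtime PM, which is the window in which the warning described above could fire. In the second ordering the flag is clear. The real driver runs the free work asynchronously, so the actual race is timing dependent in a way this single-threaded model does not capture.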