On Tue, Dec 3, 2024 at 3:25 AM Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Dec 03, 2024 at 10:20:23AM +0200, Ville Syrjälä wrote:
> > On Mon, Dec 02, 2024 at 10:40:36AM -0500, Brian Geffon wrote:
> > > On Wed, Nov 27, 2024 at 1:11 AM Ville Syrjala
> > > <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> > > >
> > > > Currently intel_dpt_resume() tries to blindly rewrite all the
> > > > PTEs for currently bound DPT VMAs. That is problematic because
> > > > the CPU mapping for the DPT is only really guaranteed to exist
> > > > while the DPT object has been pinned. In the past we worked
> > > > around this issue by making DPT objects unshrinkable, but that
> > > > is undesirable as it'll waste physical RAM.
> > > >
> > > > Let's instead forcefully evict all the DPT VMAs on suspend,
> > > > thus guaranteeing that intel_dpt_resume() has nothing to do.
> > > > To guarantee that all the DPT VMAs are evictable by
> > > > intel_dpt_suspend() we need to flush the cleanup workqueue
> > > > after the display output has been shut down.
> > > >
> > > > And for good measure throw in a few extra WARNs to catch
> > > > any mistakes.
> > > >
> > > > Cc: Brian Geffon <bgeffon@xxxxxxxxxx>
> > > > Cc: Vidya Srinivas <vidya.srinivas@xxxxxxxxx>
> > > > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> > > > ---
> > > >  .../drm/i915/display/intel_display_driver.c   |  3 +++
> > > >  drivers/gpu/drm/i915/display/intel_dpt.c      |  4 ++--
> > > >  drivers/gpu/drm/i915/gt/intel_ggtt.c          | 19 ++++++++++++++-----
> > > >  drivers/gpu/drm/i915/gt/intel_gtt.h           |  4 ++--
> > > >  4 files changed, 21 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c
> > > > index 286d6f893afa..973bee43b554 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display_driver.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display_driver.c
> > > > @@ -680,6 +680,9 @@ int intel_display_driver_suspend(struct drm_i915_private *i915)
> > > >  	else
> > > >  		i915->display.restore.modeset_state = state;
> > > >
> > > > +	/* ensure all DPT VMAs have been unpinned for intel_dpt_suspend() */
> > > > +	flush_workqueue(i915->display.wq.cleanup);
> > > > +
> > > >  	intel_dp_mst_suspend(i915);
> > > >
> > > >  	return ret;
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
> > > > index ce8c76e44e6a..8b1f0e92a11c 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> > > > @@ -205,7 +205,7 @@ void intel_dpt_resume(struct drm_i915_private *i915)
> > > >  		struct intel_framebuffer *fb = to_intel_framebuffer(drm_fb);
> > > >
> > > >  		if (fb->dpt_vm)
> > > > -			i915_ggtt_resume_vm(fb->dpt_vm);
> > > > +			i915_ggtt_resume_vm(fb->dpt_vm, true);
> > > >  	}
> > > >  	mutex_unlock(&i915->drm.mode_config.fb_lock);
> > > >  }
> > > > @@ -233,7 +233,7 @@ void intel_dpt_suspend(struct drm_i915_private *i915)
> > > >  		struct intel_framebuffer *fb = to_intel_framebuffer(drm_fb);
> > > >
> > > >  		if (fb->dpt_vm)
> > > > -			i915_ggtt_suspend_vm(fb->dpt_vm);
> > > > +			i915_ggtt_suspend_vm(fb->dpt_vm, true);
> > > >  	}
> > > >
> > > >  	mutex_unlock(&i915->drm.mode_config.fb_lock);
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > > index d60a6ca0cae5..f6c59f20832f 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > > @@ -107,11 +107,12 @@ int i915_ggtt_init_hw(struct drm_i915_private *i915)
> > > >  /**
> > > >   * i915_ggtt_suspend_vm - Suspend the memory mappings for a GGTT or DPT VM
> > > >   * @vm: The VM to suspend the mappings for
> > > > + * @evict_all: Evict all VMAs
> > > >   *
> > > >   * Suspend the memory mappings for all objects mapped to HW via the GGTT or a
> > > >   * DPT page table.
> > > >   */
> > > > -void i915_ggtt_suspend_vm(struct i915_address_space *vm)
> > > > +void i915_ggtt_suspend_vm(struct i915_address_space *vm, bool evict_all)
> > > >  {
> > > >  	struct i915_vma *vma, *vn;
> > > >  	int save_skip_rewrite;
> > > > @@ -157,7 +158,7 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
> > > >  			goto retry;
> > > >  		}
> > > >
> > > > -		if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) {
> > > > +		if (evict_all || !i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) {
> > >
> > > I don't fully understand this part. Why can we safely assume we can do
> > > __i915_vma_evict(), shouldn't we want to __i915_vma_unbind() in the
> > > case where it was bound? Because of the unconditional evict_all we
> > > might be unbinding a bound vma, no? Is that safe? Please forgive my
> > > ignorance if this question doesn't make sense.
> >
> > It looked to me like __i915_vma_unbind() pretty much just calls
> > __i915_vma_evict() anyway, and the sync stuff shouldn't matter
> > here.
> >
> > Hmm, I suppose there is that vma->node handling that might screw
> > us over somehow. I'll need to check what that actually does.
>
> Ah, we do drm_mm_remove_node(&vma->node) manually anyway here.
> That explains why it doesn't blow up later in vma_insert().
> So yeah, this kinda just looks like a hand rolled vma_unbind()
> more or less. Why it doesn't just call the whole thing I
> don't know.

Okay, thanks for looking.

One final thing, it seems the prior fix: 43e2b37e2ab6 "drm/i915/dpt:
Make DPT object unshrinkable", which we agree was not correct, was
CCed stable. Should we also do:

Fixes: 43e2b37e2ab6 "drm/i915/dpt: Make DPT object unshrinkable"
Cc: stable@xxxxxxxxxxxxxxx

On this series?

>
> --
> Ville Syrjälä
> Intel
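
For anyone following the evict_all discussion, here is a toy userspace model of the idea as I understand it: if suspend drops every binding, resume is left with no PTEs to rewrite. Everything in it (struct toy_vma, struct toy_vm, toy_suspend_vm(), toy_rewrite_count(), toy_demo()) is a made-up stand-in for illustration, not i915 code, and it ignores locking, pinning and the drm_mm node handling entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not i915 types. */
struct toy_vma {
	bool bound;		/* PTEs are currently written for this mapping */
	bool global_bind;	/* models the I915_VMA_GLOBAL_BIND check */
};

struct toy_vm {
	struct toy_vma *vmas;
	size_t count;
};

/*
 * Mirrors the shape of the patched condition: a mapping is dropped when
 * evict_all is set, or when it is not globally bound.
 */
void toy_suspend_vm(struct toy_vm *vm, bool evict_all)
{
	assert(vm != NULL);

	for (size_t i = 0; i < vm->count; i++) {
		struct toy_vma *vma = &vm->vmas[i];

		if (evict_all || !vma->global_bind)
			vma->bound = false;
	}
}

/* Number of mappings a resume pass would still have to rewrite. */
size_t toy_rewrite_count(const struct toy_vm *vm)
{
	size_t n = 0;

	for (size_t i = 0; i < vm->count; i++)
		if (vm->vmas[i].bound)
			n++;

	return n;
}

/* Suspend a small VM and report how much work is left for resume. */
size_t toy_demo(bool evict_all)
{
	struct toy_vma vmas[] = {
		{ .bound = true, .global_bind = true },
		{ .bound = true, .global_bind = false },
		{ .bound = true, .global_bind = true },
	};
	struct toy_vm vm = { .vmas = vmas, .count = 3 };

	toy_suspend_vm(&vm, evict_all);
	return toy_rewrite_count(&vm);
}
```

With evict_all set, toy_demo() leaves nothing bound, which is the property the commit message wants for intel_dpt_resume(); without it, the two globally bound entries survive and would need their PTEs rewritten.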