> -----Original Message----- > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > Sent: 23 November 2024 04:18 > To: Srinivas, Vidya <vidya.srinivas@xxxxxxxxx> > Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Syrjala, Ville <ville.syrjala@xxxxxxxxx>; > bgeffon@xxxxxxxxxx; Lee, Shawn C <shawn.c.lee@xxxxxxxxx> > Subject: Re: [PATCH] drm/i915/dpt: Restrict shrinker to DPT objects not > mapped > > On Fri, Nov 22, 2024 at 04:37:41PM +0530, Vidya Srinivas wrote: > > Restricting all DPT objects as unshrinkable was causing some > > chromebooks to run out of memory causing DMA remap failures. Thanks to > > Brian Geffon for the pointers on debug and suggesting usage of > > !obj->mm.mapping > > > > Fixes: 43e2b37e2ab6 ("drm/i915/dpt: Make DPT object unshrinkable") > > > > Credits-to: Brian Geffon <bgeffon@xxxxxxxxxx> > > Suggested-by: Ville Syrj_l_ <ville.syrjala@xxxxxxxxxxxxxxx> > > Signed-off-by: Vidya Srinivas <vidya.srinivas@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h > > b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > index 3dc61cbd2e11..b155f0139d4e 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > > @@ -285,7 +285,7 @@ i915_gem_object_is_shrinkable(const struct > > drm_i915_gem_object *obj) { > > /* TODO: make DPT shrinkable when it has no bound vmas */ > > return i915_gem_object_type_has(obj, > I915_GEM_OBJECT_IS_SHRINKABLE) && > > - !obj->is_dpt; > > + !(obj->is_dpt && obj->mm.mapping); > > AFAICS the mapping could only become NULL if the DPT got evicted from the > GGTT, but if that can happen then I think the current code must still be busted > wrt. suspend/resume even with the unshrinkable DPT obj. > > Looking at the vma suspend stuff I think the only way something could still be > bound in the DPT during resume is if it was pinned during suspend. But we > should be unpinning all FBs during suspend, so that's a bit weird in itself. Hmm, > we do the unpinning from the cleanup_work so maybe that's still pending > when we suspend and thus something is still pinned in the DPT. And indeed > there is no flush_work()/etc. for that. So perhaps if we add that we could just > revert the unshrinkable DPT patch. > Hello Ville, Thank you so much. Yes, Google team also insist on that as they worry the DPT patch might just be a workaround for the real memory corruption. > Did we have a good way to reproduce the resume explosion? If not, I suppose > we could try to confirm the lack of flush_work() as the culprit by simply sticking > some kind of sleep() into the cleanup function to make sure it doesn't get a > chance to run during suspend. We can reproduce the original issue where during the suspend/resume, the meteorlake chromebook goes for a reboot and we have the crash captured from console-ramoops. > > I also had this: > https://patchwork.freedesktop.org/patch/503398/?series=108669&rev=1 > which would be good to have so that we can actully do the obvious > flush_work(cleanup_work) instead of playing confusing tricks with the the > commit_work. > Thank you so much. Will try this patch as well and update. Regards Vidya > > } > > > > static inline bool > > -- > > 2.34.1 > > -- > Ville Syrjälä > Intel