On Thu, Sep 08, 2022 at 01:23:50PM +0100, Tvrtko Ursulin wrote:
>
> On 06/09/2022 18:46, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> >
> > On gen3 the selftests are pretty much always tripping this:
> > <4> [383.822424] pci 0000:00:02.0: drm_WARN_ON(dev_priv->mm.shrink_count)
> > <4> [383.822546] WARNING: CPU: 2 PID: 3560 at drivers/gpu/drm/i915/i915_gem.c:1223 i915_gem_cleanup_early+0x96/0xb0 [i915]
> >
> > Looks to be due to the status page object lingering on the
> > purge_list. Call synchronize_rcu() ahead of it to make more
> > sure all objects have been freed.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 0f49ec9d494a..5b61f7ad6473 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1098,6 +1098,7 @@ void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
> >  		flush_delayed_work(&i915->bdev.wq);
> >  		rcu_barrier();
> >  	}
> > +	synchronize_rcu();
>
> Looks a bit suspicious that a loop would not free all but one last rcu
> grace would. Definitely fixes the issue in your testing?

Definite is a bit hard to say with fuzzy stuff like this.
But yes, so far didn't see the warn triggering anymore.
CI results show the same.

>
> Perhaps the fact there is a cond_resched in __i915_gem_free_objects, but
> then again free count should reflect the state and keep it looping in here..
>
> Regards,
>
> Tvrtko

-- 
Ville Syrjälä
Intel