Re: [PATCH] drm/i915: Kick rcu harder to free objects

On 08/09/2022 15:32, Das, Nirmoy wrote:
Hi Ville,


I fixed a similar issue in DII, but I couldn't reproduce it in drm:

http://intel-gfx-pw.fi.intel.com/patch/228850/?series=15910&rev=2.

I wonder whether that fixes the problem you are facing. If it does, I can send it to drm.

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7809be3a6840..5438e9277924 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1213,7 +1213,7 @@  void i915_gem_init_early(struct drm_i915_private *dev_priv)

  void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
  {
-    i915_gem_drain_freed_objects(dev_priv);
+    i915_gem_drain_workqueue(dev_priv);
      GEM_BUG_ON(!llist_empty(&dev_priv->mm.free_list));
      GEM_BUG_ON(atomic_read(&dev_priv->mm.free_count));
      drm_WARN_ON(&dev_priv->drm, dev_priv->mm.shrink_count);

Yes, why not. More black magic (count to three), but if it works... :) I also spy that the general area has been a bit neglected. For example:

i915_gem_driver_remove:
...
  i915_gem_drain_workqueue
  i915_gem_drain_freed_objects

While i915_gem_drain_workqueue:
...
  i915_gem_drain_freed_objects

So i915_gem_drain_freed_objects in i915_gem_driver_remove is redundant already.

Should i915_gem_drain_freed_objects be unexported, and all callers made to just call i915_gem_drain_workqueue, after your patch? Or, if "drain freed objects" is considered more self-descriptive, it could be made an alias for i915_gem_drain_workqueue.
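
To make the alias option concrete, here is a rough userspace sketch, not the i915 code. Every mock_* name and the struct layout are hypothetical stand-ins, and the three-pass loop is only an assumption based on the "count to three" remark above.

```c
/* Userspace model only; all mock_* names are hypothetical, not i915 API. */
#include <assert.h>

struct mock_i915 {
	int free_count;   /* objects currently waiting to be freed */
	int pending_work; /* queued work items that may free more objects */
	int drain_calls;  /* number of times the free list was drained */
};

/* Stand-in for a single pass over the free list. */
static void mock_drain_free_list(struct mock_i915 *i915)
{
	i915->free_count = 0;
	i915->drain_calls++;
}

/*
 * Stand-in for the workqueue drain: flush pending work (which may queue
 * more frees), then drain the free list. The pass count of three is an
 * assumption, not taken from the driver source.
 */
static void mock_drain_workqueue(struct mock_i915 *i915)
{
	for (int pass = 0; pass < 3; pass++) {
		if (i915->pending_work > 0) {
			i915->pending_work--;
			i915->free_count++;
		}
		mock_drain_free_list(i915);
	}
}

/* The alias option: keep the descriptive name, share one implementation. */
static inline void mock_drain_freed_objects(struct mock_i915 *i915)
{
	mock_drain_workqueue(i915);
}

/* Model of the cleanup path: drain first, then assert nothing is left. */
static void mock_cleanup_early(struct mock_i915 *i915)
{
	mock_drain_freed_objects(i915);
	assert(i915->free_count == 0);
}
```

With an alias like this, callers that say "drain freed objects" and callers that say "drain workqueue" would go through the same path, so the cleanup checks would see the same state either way.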

Regards,

Tvrtko



Regards,

Nirmoy

On 9/6/2022 7:46 PM, Ville Syrjala wrote:
From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>

On gen3 the selftests are pretty much always tripping this:
<4> [383.822424] pci 0000:00:02.0: drm_WARN_ON(dev_priv->mm.shrink_count)
<4> [383.822546] WARNING: CPU: 2 PID: 3560 at drivers/gpu/drm/i915/i915_gem.c:1223 i915_gem_cleanup_early+0x96/0xb0 [i915]

Looks to be due to the status page object lingering on the
purge_list. Call synchronize_rcu() ahead of it to make more
sure all objects have been freed.

Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_gem.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0f49ec9d494a..5b61f7ad6473 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1098,6 +1098,7 @@ void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
          flush_delayed_work(&i915->bdev.wq);
          rcu_barrier();
      }
+    synchronize_rcu();
  }
  /*


