On 18/03/2019 10:56, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-03-18 10:46:28)
On 18/03/2019 10:37, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-03-18 10:31:57)
On 18/03/2019 09:51, Chris Wilson wrote:
As the final request on a ring may hold the reference to this ring (via
retiring the last pinned context), we may find ourselves chasing a
dangling pointer on completion of the list.
A quick solution is to hold a reference to the ring itself as we retire
along it so that we only free it after we stop dereferencing it.
Is there a guilty commit to reference as Fixes: ?
It only becomes a problem with veng as we gain an immediate free path,
whereas at the moment, context frees are deferred until they can acquire
the struct_mutex. We cannot hit this path at the moment, but that we had
to use the safe iterator implies that we were aware that the ring itself
could disappear. If you wanted to pin it on something,
References: b887d6154624 ("drm/i915: Retire requests along rings")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
drivers/gpu/drm/i915/i915_request.c | 6 +++++-
drivers/gpu/drm/i915/intel_engine_types.h | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 4 ++--
drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++----
drivers/gpu/drm/i915/intel_ringbuffer.h | 13 ++++++++++++-
drivers/gpu/drm/i915/selftests/mock_engine.c | 1 +
6 files changed, 27 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 9533a85cb0b3..0a3d94517d0a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
if (!i915->gt.active_requests)
return;
- list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
+ list_for_each_entry_safe(ring, tmp,
+ &i915->gt.active_rings, active_link) {
+ intel_ring_get(ring); /* last rq holds reference! */
ring_retire_requests(ring);
+ intel_ring_put(ring);
+ }
Where does it chase a dangling pointer? It used the safe iterator already.
Inside ring_retire_requests(); the use of _safe here actually implies we
met this problem already :)
I get it, the issue is during ring->request_list iteration in
ring_retire_requests. How about move ring pinning in there so it is clearer?
It's only this path that is affected. That maye becomes clearer when we
retire along timelines and not rings, which is the patchset I wanted to
land to fix this, but figured this was a tiny fix for now.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
I failed to find a short and clear suggestion for making the comment
more precise.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx