Quoting Tvrtko Ursulin (2019-03-18 10:46:28) > > On 18/03/2019 10:37, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2019-03-18 10:31:57) > >> > >> On 18/03/2019 09:51, Chris Wilson wrote: > >>> As the final request on a ring may hold the reference to this ring (via > >>> retiring the last pinned context), we may find ourselves chasing a > >>> dangling pointer on completion of the list. > >>> > >>> A quick solution is to hold a reference to the ring itself as we retire > >>> along it so that we only free it after we stop dereferencing it. > >> > >> Is there a guilty commit to reference as Fixes: ? > > > > It only becomes a problem with veng as we gain an immediate free path, > > whereas at the moment, context frees are deferred until they can acquire > > the struct_mutex. We cannot hit this path at the moment, but that we had > > to use the safe iterator implies that we were aware that the ring itself > > could disappear. If you wanted to pin it on something, > > > > References: b887d6154624 ("drm/i915: Retire requests along rings") > > > >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>> --- > >>> drivers/gpu/drm/i915/i915_request.c | 6 +++++- > >>> drivers/gpu/drm/i915/intel_engine_types.h | 2 ++ > >>> drivers/gpu/drm/i915/intel_lrc.c | 4 ++-- > >>> drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++---- > >>> drivers/gpu/drm/i915/intel_ringbuffer.h | 13 ++++++++++++- > >>> drivers/gpu/drm/i915/selftests/mock_engine.c | 1 + > >>> 6 files changed, 27 insertions(+), 8 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > >>> index 9533a85cb0b3..0a3d94517d0a 100644 > >>> --- a/drivers/gpu/drm/i915/i915_request.c > >>> +++ b/drivers/gpu/drm/i915/i915_request.c > >>> @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915) > >>> if (!i915->gt.active_requests) > >>> return; > >>> > >>> - list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link) > >>> + list_for_each_entry_safe(ring, tmp, > >>> + &i915->gt.active_rings, active_link) { > >>> + intel_ring_get(ring); /* last rq holds reference! */ > >>> ring_retire_requests(ring); > >>> + intel_ring_put(ring); > >>> + } > >> > >> Where does it chase a dangling pointer? It used the safe iterator already. > > > > Inside ring_retire_requests(); the use of _safe here actually implies we > > met this problem already :) > > I get it, the issue is during ring->request_list iteration in > ring_retire_requests. How about move ring pinning in there so it is clearer? It's only this path that is affected. That maye becomes clearer when we retire along timelines and not rings, which is the patchset I wanted to land to fix this, but figured this was a tiny fix for now. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx