On Fri, Feb 13, 2015 at 10:58 AM, Nick Hoath <nicholas.hoath@xxxxxxxxx> wrote: > On 13/02/2015 09:32, Daniel Vetter wrote: >> >> On Thu, Feb 12, 2015 at 12:29:21PM +0000, Nick Hoath wrote: >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652 >>> >>> Signed-off-by: Nick Hoath <nicholas.hoath@xxxxxxxxx> >> >> >> Commit message is missing the absolutely crucial detail about which patch >> introduced this regression: >> >> commit 6d3d8274bc45de4babb62d64562d92af984dd238 >> Author: Nick Hoath <nicholas.hoath@xxxxxxxxx> >> AuthorDate: Thu Jan 15 13:10:39 2015 +0000 >> >> drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request >> >> Another thing I've noticed is that we explicitly drop the context >> reference for the request before dropping the request reference. Without >> clearing the req->ctx pointer. That has a very high chance to leading to >> tears, imo the context unreferenceing should be pushed into >> i915_gem_request_free. >> >> Except that it's there already, which means we have a double unref now? > > > Looking at the code, it looks like that's the case. > >> >> Also this patch is for the legacy ringbuffer code, but the referenced bug >> is for gen8+ execlists. We're definitely not running this code here I >> think. > > > i915_gem_reset_ring_cleanup is used in execlists in the hang recovery case. Oops I've missed that, somehow I've thought intel_lrc.c has it's own reset recovery code. But I've just mixed up function names a bit. >> Imo step one is to drop all the explicit ctx refcounting for req->ctx and >> always rely on the implicit reference. Then see what happens. > > > I agree that the refcounting needs re-evaluating after the merge of execlist > queue entries & requests, however I think the cleanup of the double > unref/removing the refcounting should be done in another patchset. This > patch is purely to fix the issue raised in 88652. Depends on the relative > priorities. Well this patch here won't work because there's now a double unref. And I spotted that one because intel_execlists_retire_requests seems to have the exact same double unref already. Hence why I think we should fix up that first and then reasses what's left. The bug here (before your patch) is just a use-after-free, if there's some other reference to the request. And it will also be fixed with the redone req->ctx refcounting. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx