On Thu, Mar 19, 2015 at 10:17:42PM +0000, Chris Wilson wrote: > On Thu, Mar 19, 2015 at 06:37:28PM +0100, Daniel Vetter wrote: > > On Wed, Mar 18, 2015 at 06:19:22PM +0000, Chris Wilson wrote: > > > WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140() > > > WARN_ON(!list_empty(&vm->active_list)) > > > > How does this come about - we call gpu_idle before this seems to blow up, > > so all requests should be completed? > > Honestly, I couldn't figure it out either. I had an epiphany when I saw > that we could now have an empty request list but non-empty active list > added a test to detect when that happens and shouted eureka when the > WARN fired. I could trigger the WARN in evict_vm pretty reliably, but > not since this patch. It could just be masking another bug. Can you perhaps double-check the theory by putting a WARN_ON(list_empty(active_list) != list_empyt(request_list)) into gpu_idle? Ofc with this patch reverted so that the bug surfaces again. Really strange indeed. > > And I don't think we can blame this > > on racy seqno signalling, since gpu_idle does all the waiting already ... > > > > > Identified by updating WATCH_LISTS: > > > > > > [drm:i915_verify_lists] *ERROR* blitter ring: active list not empty, but no requests > > > WARNING: CPU: 0 PID: 681 at drivers/gpu/drm/i915/i915_gem.c:2751 i915_gem_retire_requests_ring+0x149/0x230() > > > WARN_ON(i915_verify_lists(ring->dev)) > > > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Cc: John Harrison <John.C.Harrison@xxxxxxxxx> > > > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > > > > Since we've just discussed this on irc: Doesn't this now enshrine that > > every bo needs to hold a full request? > > I'm not following. The bo hold a reference to requests, so we know we > can iterate the ring->request_list and the ring->active_list > independently. There is a challenge in doing the execbuf with as few > kref as possible, but there is also the question of whether this > particular function should go back to the previous behaviour of batching > the completion evaluation for all requests such that they are evaluated > consistently. One way without killing the abstraction entirely would be > to evaluate the i915_request_complete() only for the request_list and > then use the cached completion value for the active_list. Yeah I meant the kref batching the old scheme would have allowed. I guess better to figure this one out first completely before we dig into micro-optimizations again. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx