On Tue, Jul 01, 2014 at 11:17:35AM -0700, Ben Widawsky wrote:
> Here be all the patches to make full PPGTT relatively stable on Broadwell. Most
> of the work was actually to the generic PPGTT code, and not BDW specific. There
> are basically 3 fixes:
> 1. Make the error state not horrible, but more work still needed.
> 2. Fix up another tricky from != from case in do_switch
> 3. Generally more graceful handling in ppgtt_release
>
> 1. Various forms of this subseries have shown up on the list from both
> myself, and Chris (and I feel like Mika, also). For whatever reason,
> none have been merged. I don't necessarily mind missing information, but
> without these patches we can OOPs and GP fault in the error state, which
> is not acceptable.
>
> 2. The meat of the debugging came here. Essentially this problem has
> already been seen, and solved. The issue is, there is another spot in
> do_switch that can invoke a switch to the default context. We probably
> want to get slightly better handling of this, but for now this solves my
> problems.
>
> 3. There are 2 categories addressed in this bucket. Reset, and signals. The
> patches themselves explain the situation. I take a two step approach with this
> where first I make things correct (and slow as heck), and then I do the more
> optimal and trickier solution. Both of these are in line to be replaced by
> Daniel, but I needed something sooner.
>
> Here is the test case that caught all of the above issues:
> while [ 1 ] ; do
>         (glxgears) & pid[0]=$!
>         (glxgears) & pid[1]=$!
>         (glxgears) & pid[2]=$!
>         sleep 3
>         kill ${pid[*]}
> done
>
>
> Ben Widawsky (16):
>   drm/i915: Split up do_switch
>   drm/i915: Extract l3 remapping out of ctx switch
>   drm/i915/ppgtt: Load address space after mi_set_context
>   drm/i915: Fix another another use-after-free in do_switch
>   drm/i915/ctx: Return earlier on failure
>   drm/i915/error: Check the potential ctx obj's vm
>   drm/i915/error: vma error capture prettyify
>   drm/i915/error: Do a better job of disambiguating VMAs
>   drm/i915/error: Capture vmas instead of BOs
>   drm/i915: Add some extra guards in evict_vm
>   drm/i915: Make an uninterruptible evict
>   drm/i915: Reorder ctx unref on ppgtt cleanup
>   drm/i915: More correct (slower) ppgtt cleanup
>   drm/i915: Defer PPGTT cleanup
>   drm/i915/bdw: Enable full PPGTT
>   drm/i915: Get the error state over the wire (HACKish)

I think we can pull roughly the first 9 patches (one of them is already
merged). It just needs someone to do the in-depth review, preferably
someone who's looking at the relevant jira ppgtt tasks we're tracking
internally.

For the changes affecting ppgtt_cleanup I've replied in a separate mail.
-Daniel

>
>  drivers/gpu/drm/i915/i915_debugfs.c        |   2 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  18 ++-
>  drivers/gpu/drm/i915/i915_gem.c            | 110 +++++++++++++
>  drivers/gpu/drm/i915/i915_gem_context.c    | 238 ++++++++++++++++++++++-------
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  39 +++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |   3 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |   4 +
>  drivers/gpu/drm/i915/i915_gpu_error.c      | 157 ++++++++++++-------
>  drivers/gpu/drm/i915/i915_sysfs.c          |   2 +-
>  10 files changed, 449 insertions(+), 126 deletions(-)
>
> --
> 2.0.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch