The problem is we're going to switch to a new context, which could be the default context. The plan was to use restore inhibit, which would be fine, except if we are using dynamic page tables (which we will). If we use dynamic page tables and we don't load new page tables, the previous page tables might go away, and future operations will fault. CTXA runs. switch to default, restore inhibit CTXA dies and has its address space taken away. Run CTXB, tries to save using the context A's address space - this fails. The general solution is to make sure every context has it's own state, and its own address space. For cases when we must restore inhibit, first thing we do is load a valid address space. I thought this would be enough, but apparently there are references within the context itself which will refer to the old address space - therefore, we also must reinitialize. It was tricky to track this down as we don't have much insight into what happens in a context save. This is required for the next patch which enables dynamic page tables. I need to go back and test this pre-GEN8 Signed-off-by: Ben Widawsky <ben@xxxxxxxxxxxx> --- drivers/gpu/drm/i915/i915_gem_context.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index dade4e2..59c5173 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -807,9 +807,13 @@ static int do_switch_rcs(struct intel_engine_cs *ring, i915_gem_vma_bind(vma, to->legacy_hw_ctx.rcs_state->cache_level, GLOBAL_BIND); } - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) + if (!to->legacy_hw_ctx.initialized) { hw_flags |= MI_RESTORE_INHIBIT; - else if (test_and_clear_bit(ring->id, &to->vm->pd_reload_mask)) + /* NB: If we inhibit the restore, the context is not allowed to + * die because future work may end up depending on valid address + * space. This means we must enforce that a page table load + * occur when this occurs. */ + } else if (test_and_clear_bit(ring->id, &to->vm->pd_reload_mask)) hw_flags |= MI_FORCE_RESTORE; ret = mi_set_context(ring, to, hw_flags); @@ -823,12 +827,11 @@ static int do_switch_rcs(struct intel_engine_cs *ring, /* GEN8 does *not* require an explicit reload if the PDPs have been * setup, and we do not wish to move them. - * - * XXX: If we implemented page directory eviction code, this - * optimization needs to be removed. */ - if (IS_GEN8(ring->dev) && needs_pd_load(ring, to)) { - /* Doing a PD load always reloads the page dirs */ + if (IS_GEN8(ring->dev) && (hw_flags & MI_RESTORE_INHIBIT)) { + /* We have a valid page directory (scratch) to switch to. This + * allows the old VM to be freed. Note that if anything occurs + * between the set context, and here, we are f*cked */ ret = ppgtt->switch_mm(ppgtt, ring, false); /* The hardware context switch is emitted, but we haven't * actually changed the state - so it's probably safe to bail @@ -863,7 +866,7 @@ static int do_switch_rcs(struct intel_engine_cs *ring, BUG_ON(from->legacy_hw_ctx.rcs_state->ring != ring); } - uninitialized = !to->legacy_hw_ctx.initialized && from == NULL; + uninitialized = !to->legacy_hw_ctx.initialized; to->legacy_hw_ctx.initialized = true; /* From may have disappeared again after the context unref */ from = do_switch_fini_common(ring, from, to); -- 2.0.4 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx