Re: [PATCH v3] drm/i915: Do not call API requiring struct_mutex where it is not available

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Wed, 13 Jan 2016 19:34:54 +0000

On Wed, Jan 13, 2016 at 04:16:21PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> 
> LRC code was calling GEM API like i915_gem_obj_ggtt_offset from
> places where the struct_mutex cannot be grabbed (irq handlers).
> 
> To avoid that this patch caches some interesting bits and values
> in the engine and context structures.
> 
> Some usages are also removed where they are not needed like a
> few asserts which are either impossible or have been checked
> already during engine initialization.
> 
> Side benefit is also that interrupt handlers and command
> submission stop evaluating invariant conditionals, like what
> Gen we are running on, on every interrupt and every command
> submitted.
> 
> This patch deals with logical ring context id and descriptors
> while subsequent patches will deal with the remaining issues.
> 
> v2:
>  * Cache the VMA instead of the address. (Chris Wilson)
>  * Incorporate Dave Gordon's good comments and function name.
> 
> v3:
>  * Extract ctx descriptor template to a function and group
>    functions dealing with ctx descriptor & co together near
>    top of the file. (Dave Gordon)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Cc: Dave Gordon <david.s.gordon@xxxxxxxxx>

Considering the current state of affairs, this is not bad. I have a
different shade of slightly-off-white that I like but that applies to a
world where execlists is more request focused, e.g.:

static u32 execlists_request_write_tail(struct drm_i915_gem_request *req)
{
        struct intel_ring *ring = req->ring;
        struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;

        if (ppgtt && !USES_FULL_48BIT_PPGTT(req->i915)) {
                /* True 32b PPGTT with dynamic page allocation: update PDP
                 * registers and point the unallocated PDPs to scratch page.
                 * PML4 is allocated during ppgtt init, so this is not needed
                 * in 48-bit mode.
                 */
                if (ppgtt->pd_dirty_rings & intel_engine_flag(req->engine)) {
                        ASSIGN_CTX_PDP(ppgtt, ring->registers, 3);
                        ASSIGN_CTX_PDP(ppgtt, ring->registers, 2);
                        ASSIGN_CTX_PDP(ppgtt, ring->registers, 1);
                        ASSIGN_CTX_PDP(ppgtt, ring->registers, 0);
                        ppgtt->pd_dirty_rings &= ~intel_engine_flag(req->engine);
                }
        }

        ring->registers[CTX_RING_TAIL+1] = req->tail;
        return ring->context_descriptor;
}

where the req->ring is much more readily available than
	struct intel_context_engine *ce = &req->ctx->engine[req->engine->id];
though a req->context_engine would shut me up entirely. 

> +/**
> + * intel_execlists_ctx_id() - get the Execlists Context ID
> + * @ctx: Context to get the ID for
> + * @ring: Engine to get the ID for
> + *
> + * Do not confuse with ctx->id! Unfortunately we have a name overload
> + * here: the old context ID we pass to userspace as a handler so that
> + * they can refer to a context, and the new context ID we pass to the
> + * ELSP so that the GPU can inform us of the context status via
> + * interrupts.
> + *
> + * The context ID is a portion of the context descriptor, so we can
> + * just extract the required part from the cached descriptor.
> + *
> + * Return: 20-bits globally unique context ID.
> + */
> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
> +			   struct intel_engine_cs *ring)

I would suggest intel_execlists_status_tag (or at least offer it as a
starting point for a non-conflicting name).

> +{
> +	return intel_lr_context_descriptor(ctx, ring) >> GEN8_CTX_ID_SHIFT;
>  }

Next up on the hit list is struct intel_context_engine!

> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 7349d9258191..85ce2272f92c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -269,6 +269,8 @@ struct  intel_engine_cs {
>  	struct list_head execlist_queue;
>  	struct list_head execlist_retired_req_list;
>  	u8 next_context_status_buffer;
> +	bool disable_lite_restore_wa;

The holes, the holes!

> +	u32 ctx_desc_template;

I would suggest this is better named something like
execlist_context_base_descriptor since we already have been using the
execlist_ prefix to distinguish lrc specific state, and being verbose
doesn't really hurt when we only use it twice.

Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx