On 25/02/2015 21:15, Daniel Vetter wrote:
On Wed, Feb 18, 2015 at 03:27:38PM +0000, John Harrison wrote:
On 13/02/2015 12:15, Chris Wilson wrote:
On Fri, Feb 13, 2015 at 11:48:33AM +0000, John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>
In execlist mode, context initialisation is deferred until first use of the
given context. This is because execlist mode has many more contexts than legacy
mode and many are never actually used.
That's not correct. There are no more contexts in execlists than legacy.
There are more ringbuffers, or rather the contexts have an extra state
object associated with them.
Okay, I should have said sub-contexts. Or context state objects. Or
something.
per-engine ctx state? Naming stuff is hard ;-)
Previously, the initialisation commands
were written to the ring and tagged with some random request structure via the
OLR. This seemed to be causing a null pointer deference bug under certain
circumstances (BZ:40112).
This patch adds explicit request creation and submission to the deferred
initialisation code path. Thus removing any reliance on or randomness caused by
the OLR.
This is upside down though. The request should be referencing the
context (thus instantiating it on demand) and nothing in the context
allocation requires the request. The initialisation here should be during
i915_request_switch_context(), since it can be entirely shared with
legacy.
-Chris
The request does reference the context - the alloc_reques() function takes a
context object as a parameter. Thus it is impossible for the request to be
used/supplied/required during context creation. The issue here is the lazy
initialisation of the per ring context state which requires sending commands
to the ring on first usage of the given context object on the given ring.
One problem is that the initialisation request and the batch buffer request
cannot be merged at the moment. They both use request->batch_obj for
tracking the command object. Thus this patch only works due to the deferred
intialisation occurring during the i915_gem_validate_context() call very
early on in execbuffer() rather than as part of the context switch within
the batch buffer execution which is much later.
My request struct doesn't have a batch_obj pointer. Where is that from and
why do we need it? Atm just chasing Chris' comments, haven't read the full
series yet.
It should do! It was added way back in June 2013 by Mika in 'drm/i915:
add batch bo to i915_add_request()'. So unless someone has removed it
again since I last fetched a tree, you should definitely have it.
I'm not sure what you mean by i915_request_switch_context(). The existing
i915_switch_context() does now take just a request structure rather than a
ring/ringbuf/context mixture. However, it is not really a good idea to do
the context switch automatically as part of creating the request. The
request creation and request execution could be quite separated in time,
especially with a scheduler.
It should be possible to move the deferred initialisation within the context
switch if the object tracking can be resolved. Thus they could share the
same request and there would not be effectively two separate execution calls
at the hardware level. Again, that's potentially work that could be done as
a follow up task of improving the context management independent of the
current task of removing the OLR.
I think the biggest risk with adding a separate request for the lrc
deferred init is in accidentally nesting request when someone moves around
the lrc validation. Atm it's at the top of execbuf but we tend to shuffle
things around a lot.
Is there some simple WARN_ON we could smash into the alloc function to
make sure this never happens? ring->olr would be it, but since we want to
kill that that's not great. Or do I see risks which aren't really there?
-Daniel
There is a WARN_ON that the batch object in the request structure does
not get overwritten. Although, that could only happen if the lazy setup
and the batch buffer execution were sharing the same request. Nesting
multiple requests shouldn't really be a problem. Without the OLR, there
is no globally shared state. Having multiple requests being created in
parallel is fine. The lazy context setup must be completed before the
execbuffer is executed otherwise Bad Things are going to happen
irrespective of request usage. So even if the execbuffer request is
created first, the lazy setup will still happen at the right time
without breaking the execbuffer's request.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx