Re: [PATCH v3 4/7] drm/i915: Cache LRC state page in the context

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 12/01/16 16:45, Dave Gordon wrote:
On 12/01/16 13:11, Chris Wilson wrote:
On Tue, Jan 12, 2016 at 12:54:25PM +0000, Tvrtko Ursulin wrote:

On 12/01/16 12:12, Chris Wilson wrote:
On Tue, Jan 12, 2016 at 11:56:11AM +0000, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

LRC lifetime is well defined so we can cache the page pointing
to the object backing store in the context in order to avoid
walking over the object SG page list from the interrupt context
without the big lock held.

v2: Also cache the mapping. (Chris Wilson)
v3: Unmap on the error path.

Then we only use the lrc_state_page in the unmapping, hardly worth the
cost of saving it.

Ok.

Do you also know if this would now require any flushing or something
if previously kunmap_atomic might have done something under the
covers?

kmap_atomic only changes the PTE and the unmap would flush the TLB. In
terms of our pointer access, using kmap/kmap_atomic is equivalent. (Just
kmap_atomic is meant to be cheaper to set up and a limited resource
which can only be used without preemption.)

The reg_state is better associated with the ring (since it basically
contains the analog of the RING_HEAD and friends).

Hm, not sure. The page belongs to the object from that anonymous
struct in intel_context so I think it is best to keep them together.

The page does sure, but the state does not.

The result is that we get:

ring->registers[CTX_RING_BUFFER_STATE+1] = ring->vma->node.state;
ASSIGN_CTX_PDP(ppgtt, ring->registers, 3);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 2);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 1);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 0);
ring->registers[CTX_RING_TAIL+1] = req->tail;

which makes a lot more sense, to me, when viewed against the underlying
architecture of the hardware and comparing against the legacy ringbuffer.
-Chris

When you say "ring", do you mean "engine" or "ringbuffer"?

The register state can't be associated with the engine /per se/, because
it has to be per-context as well as per-engine. It doesn't really belong
with the ringbuffer; in particular I've seen code for allocating the
ringbuffer in stolen memory and dropping it during hibernation, whereas
this register state shouldn't be lost across hibernation. So the
lifetime of a register state image and a ringbuffer are different,
therefore they don't belong together.

The register state is pretty much the /definition/ of a context, in
hardware terms. OK, it's got an HWSP and other bits, but the register
save area is what the h/w really cares about. So it makes sense that the
kmapping for that area is also part of (the CPU's idea of) the context.
Yes,

     ctx.engine[engine_id].registers[regno] = ...

is a bit clumsy, but I'd expect to cache the register-image pointer in a
local here, along the lines of:

     uint32_t *registers = ctx.engine[engine_id].registers;
     registers[CTX_RING_TAIL+1] = req->tail;

etc.

[aside]
Some of this would be much less clumsy if the anonymous engine[]
struct had a name, say, "engine_info", so we could just do
     struct engine_info *eip = &ctx.engines[engine->id];
once at the beginning of any per-engine function and then use
eip->ringbuf, eip->state, eip->registers, etc without continually
repeating the 'ctx.' and 'engine->id' bits over and over ...
[/aside]

Apart from that, I think Tvrtko's patch has lost the dirtying from:

 > -    page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);

in execlists_update_context(), so should add

     set_page_dirty(lrc_state_page)

instead (and that's the use for it, so you /do/ want to cache it).

It is still there (i915_gem_object_get_dirty_page) but at the point of pinning, not each ctx update. Do you think that is a problem? I thought no one can flush and clear the dirty page while the ctx is pinned.

On the more general level, can we agree to let this fix with the pointer cached where it currently is (in the context) and leave the anonymous struct - which we all dislike - and I'd add it should probably be defined in intel_lrc.h - for a later cleanup?

If so I only need to respin with the struct page * caching removed.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux