Re: [PATCH 5/5] drm/i915: s/seqno/request/ tracking inside objects

John Harrison <John.C.Harrison@xxxxxxxxx> · Tue, 02 Sep 2014 11:06:29 +0100

Hello,

Is this patch going to be split up into more manageable pieces? I tried 
to apply it to a tree fetched yesterday and got a very large number of 
conflicts. I don't know whether that is because more execlist patches 
have been merged or if it is other random changes that have broken it or 
if I am just missing earlier patches in the set.

The patch has been sent with subjects of '[PATCH]', '[PATCH 5/5]' and 
'[PATCH 3/3]'. However, all three emails seem to be the same humongous 
single part patch and I can't find any 0/3, 4/5, etc. emails. Am I 
missing some prep work patches without which the final monster patch is 
never going to apply?

Thanks,
John.

On 27/08/2014 11:39, Chris Wilson wrote:
On Wed, Aug 27, 2014 at 11:55:34AM +0200, Daniel Vetter wrote:
On Tue, Aug 12, 2014 at 08:05:51PM +0100, Chris Wilson wrote:
At the heart of this change is that the seqno is a too low level of an
abstraction to handle the growing complexities of command tracking, both
with the introduction of multiple command queues with execbuffer and the
potential for reordering with a scheduler. On top of the seqno we have
the request. Conceptually this is just a fence, but it also has
substantial bookkeeping of its own in order to track the context and
batch in flight, for example. It is the central structure upon which we
can extend with dependency tracking et al.

As regards the objects, they were using the seqno as a simple fence,
upon which is check or even wait upon for command completion. This patch
exchanges that seqno/ring pair with the request itself. For the
majority, lifetime of the request is ordered by how we retire objects
then requests. However, both the unlocked waits and probing elsewhere do
not tie into the normal request lifetimes and so we need to introduce a
kref. Extending the objects to use the request as the fence naturally
extends to segregrating read/write fence tracking. This has significance
for it reduces the number of semaphores we need to emit, reducing the
likelihood of #54226, and improving performance overall.

v2: Rebase and split out the othogonal tweaks.

A silly happened with this patch. It seemed to nullify our earlier
seqno-vs-interrupt w/a. I could not spot why, but gen6+ started to fail
with missed interrupts (a good test of our robustness handling). So I
ripped out the existing ACTHD read and replaced it with a RING_HEAD to
manually check whether the request is complete. That also had the nice
consequence of forcing __wait_request() to being the central arbiter of
request completion.

The keener eyed reviewr will also spot that the reset_counter is moved
into the request simplifing __wait_request() callsites and reducing the
number of atomic reads by virtue of moving the check for a pending GPU
reset to the endpoints of GPU access.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: Oscar Mateo <oscar.mateo@xxxxxxxxx>
Cc: Brad Volkin <bradley.d.volkin@xxxxxxxxx>
Cc: "Kukanova, Svetlana" <svetlana.kukanova@xxxxxxxxx>
So I've tried to split this up and totally failed. Non-complete list of
things I didn't manage to untangle:

- The mmio flip refactoring.
Yeah, that's a fairly instrumental part of the patch. It's not
complicated but it does benefit a lot from using requests to both make
the decision cleaner and the tracking correct.

- The overlay request tracking refactoring.
Again, the api changes allow the code to be compacted.

- The switch to multiple parallel readers with the resulting cascading
   changes all over.
I thought that was fairly isolated to the gem object. It's glossed over
in errors/debugfs for simplicity, which deserves to be fixed given a
compact representation of all the requests, and so would kill the icky
code to find the "last" request.

- The missed irq w/a prep changes. It's easy to split out the change to
   re-add the rc6 reference and to ditch the ACT_HEAD read, but the commit
   message talks about instead reading the RING_HEAD, and I just didn't
   spot the changes relevant to that in this big diff. Was probably looking
   in the wrong place.
I did mention that I tried that earlier on on the ml, but missed saying
that it the forcewake reference didn't unbreak the old w/a in the
changelog.

- The move_to_active/retire refactoring. There's a pile of code movement
   in there, but I couldn't spot really what's just refactoring and what is
   real changed needed for the s/seqno/request/ change.
move-to() everything since that now operates on the request. retire(), not
a lot changes there, just the extra requests being tracked and the strict
lifetime ordering of the reference the object holds onto the requests.

- De-duping some of the logical_ring_ functions. Spotted because it
   conflicted (but was easy to hack around), still this shouldn't really be
   part of this.
Things I've spotted which could be split out but amount to a decent
rewrite of the patch:
- Getting at the ring of the last write to an object. Although I guess
   without the multi-reader stuff and the pageflip refactoring that would
   pretty much disappear.
Why? Who uses the ring of the last_write request? We compare engines in
pageflip but that's about it.

- Probably similar helpers for seqno if we don't switch to parallel writes
   in the same patch.

Splitting out the renames was easy, but that reduced the diff by less than
5% in size. So didn't help in reviewing the patch at all.
The actual rename patch is larger than this one (v2).
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx