On Wed, Aug 27, 2014 at 11:55:34AM +0200, Daniel Vetter wrote: > On Tue, Aug 12, 2014 at 08:05:51PM +0100, Chris Wilson wrote: > > At the heart of this change is that the seqno is a too low level of an > > abstraction to handle the growing complexities of command tracking, both > > with the introduction of multiple command queues with execbuffer and the > > potential for reordering with a scheduler. On top of the seqno we have > > the request. Conceptually this is just a fence, but it also has > > substantial bookkeeping of its own in order to track the context and > > batch in flight, for example. It is the central structure upon which we > > can extend with dependency tracking et al. > > > > As regards the objects, they were using the seqno as a simple fence, > > upon which is check or even wait upon for command completion. This patch > > exchanges that seqno/ring pair with the request itself. For the > > majority, lifetime of the request is ordered by how we retire objects > > then requests. However, both the unlocked waits and probing elsewhere do > > not tie into the normal request lifetimes and so we need to introduce a > > kref. Extending the objects to use the request as the fence naturally > > extends to segregrating read/write fence tracking. This has significance > > for it reduces the number of semaphores we need to emit, reducing the > > likelihood of #54226, and improving performance overall. > > > > v2: Rebase and split out the othogonal tweaks. > > > > A silly happened with this patch. It seemed to nullify our earlier > > seqno-vs-interrupt w/a. I could not spot why, but gen6+ started to fail > > with missed interrupts (a good test of our robustness handling). So I > > ripped out the existing ACTHD read and replaced it with a RING_HEAD to > > manually check whether the request is complete. That also had the nice > > consequence of forcing __wait_request() to being the central arbiter of > > request completion. > > > > The keener eyed reviewr will also spot that the reset_counter is moved > > into the request simplifing __wait_request() callsites and reducing the > > number of atomic reads by virtue of moving the check for a pending GPU > > reset to the endpoints of GPU access. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> > > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > > Cc: Oscar Mateo <oscar.mateo@xxxxxxxxx> > > Cc: Brad Volkin <bradley.d.volkin@xxxxxxxxx> > > Cc: "Kukanova, Svetlana" <svetlana.kukanova@xxxxxxxxx> > > So I've tried to split this up and totally failed. Non-complete list of > things I didn't manage to untangle: > > - The mmio flip refactoring. Yeah, that's a fairly instrumental part of the patch. It's not complicated but it does benefit a lot from using requests to both make the decision cleaner and the tracking correct. > - The overlay request tracking refactoring. Again, the api changes allow the code to be compacted. > - The switch to multiple parallel readers with the resulting cascading > changes all over. I thought that was fairly isolated to the gem object. It's glossed over in errors/debugfs for simplicity, which deserves to be fixed given a compact representation of all the requests, and so would kill the icky code to find the "last" request. > - The missed irq w/a prep changes. It's easy to split out the change to > re-add the rc6 reference and to ditch the ACT_HEAD read, but the commit > message talks about instead reading the RING_HEAD, and I just didn't > spot the changes relevant to that in this big diff. Was probably looking > in the wrong place. I did mention that I tried that earlier on on the ml, but missed saying that it the forcewake reference didn't unbreak the old w/a in the changelog. > - The move_to_active/retire refactoring. There's a pile of code movement > in there, but I couldn't spot really what's just refactoring and what is > real changed needed for the s/seqno/request/ change. move-to() everything since that now operates on the request. retire(), not a lot changes there, just the extra requests being tracked and the strict lifetime ordering of the reference the object holds onto the requests. > - De-duping some of the logical_ring_ functions. Spotted because it > conflicted (but was easy to hack around), still this shouldn't really be > part of this. > Things I've spotted which could be split out but amount to a decent > rewrite of the patch: > - Getting at the ring of the last write to an object. Although I guess > without the multi-reader stuff and the pageflip refactoring that would > pretty much disappear. Why? Who uses the ring of the last_write request? We compare engines in pageflip but that's about it. > - Probably similar helpers for seqno if we don't switch to parallel writes > in the same patch. > > Splitting out the renames was easy, but that reduced the diff by less than > 5% in size. So didn't help in reviewing the patch at all. The actual rename patch is larger than this one (v2). -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx