On Wed, Feb 15, 2012 at 11:25:36AM +0000, Chris Wilson wrote:
> By recording the location of every request in the ringbuffer, we know
> that in order to retire the request the GPU must have finished reading
> it and so the GPU head is now beyond the tail of the request. We can
> therefore provide a conservative estimate of where the GPU is reading
> from in order to avoid having to read back the ring buffer registers
> when polling for space upon starting a new write into the ringbuffer.
>
> A secondary effect is that this allows us to convert
> intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
> upon the single function to handle the complicated task of waiting upon
> the GPU. A necessary precaution is that we need to make that wait
> uninterruptible to match the existing conditions as all the callers of
> intel_ring_begin() have not been audited to handle ERESTARTSYS
> correctly.
>
> By using a conservative estimate for the head, and always processing all
> outstanding requests first, we prevent a race condition between using
> the estimate and direct reads of I915_RING_HEAD which could result in
> the value of the head going backwards, and the tail overflowing once
> again. We are also careful to mark any request that we skip over in
> order to free space in ring as consumed which provides a
> self-consistency check.
>
> Given sufficient abuse, such as a set of unthrottled GPU bound
> cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
> Sandy Bridge (i5-2520m):
>   firefox-paintball  18927ms -> 15646ms: 1.21x speedup
>   firefox-fishtank   12563ms -> 11278ms: 1.11x speedup
> which is a mild consolation for the performance those traces achieved from
> exploiting the buggy autoreported head.
>
> v2: Add a few more comments and make request->tail a conservative
> estimate as suggested by Daniel Vetter.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>

Thanks for these 3 patches, queued for next.
I've had to resolve a little conflict in this one because
- you've based these on a tree without Ben's defer retirement patches
- and I don't want to double-merge the autoreport_head removal patch to both
  -fixes and -next

As for -next plans, I think this is all for the current round. I plan to push
out a new -next for testing in 1-2 days.

-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
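
For readers following the thread, the idea in the quoted commit message (use the recorded tail of a retired request as a conservative head, instead of reading I915_RING_HEAD) can be sketched roughly as below. This is a simplified standalone illustration, not the code from the patch: the struct layout, the helper name find_request_for_space(), and the 8-byte gap constant are all assumptions made for the example.

```c
/* Hypothetical, simplified model of one submitted request. */
struct request {
    unsigned int seqno; /* sequence number the GPU signals on completion */
    int tail;           /* ring offset just past this request's commands */
};

/* Assumed slack kept between tail and head so the two never meet. */
#define RING_GAP 8

/*
 * Free bytes between the write position (tail) and a head position,
 * accounting for wrap-around in a ring of `size` bytes.
 */
static int ring_space(int head, int tail, int size)
{
    int space = head - (tail + RING_GAP);
    if (space < 0)
        space += size;
    return space;
}

/*
 * Walk the outstanding requests oldest-first and return the index of the
 * first one whose recorded tail, used as a conservative head estimate,
 * would leave at least `n` free bytes. Once that request has retired, the
 * GPU must have read past its tail, so no register read is needed.
 * Returns -1 if even the newest request would not free enough space.
 */
static int find_request_for_space(const struct request *reqs, int count,
                                  int tail, int size, int n)
{
    for (int i = 0; i < count; i++) {
        if (ring_space(reqs[i].tail, tail, size) >= n)
            return i;
    }
    return -1;
}
```

With a 4096-byte ring, a current tail of 3584, and requests ending at 1024, 2048 and 3072, asking for 2000 bytes selects the second request: its tail yields 2552 free bytes, while the first yields only 1528. A caller would then wait for that request's seqno and adopt its tail as the new head estimate.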