On 03/11/14 20:59, Chris Wilson wrote: > On Mon, Nov 03, 2014 at 01:29:04PM +0000, Dave Gordon wrote: >> Fixes to both the LRC and the legacy ringbuffer code to correctly >> calculate and update the available space in a ring. >> >> The logical ring code was updating the software ring 'head' value >> by reading the hardware 'HEAD' register. In LRC mode, this is not >> valid as the hardware is not necessarily executing the same context >> that is being processed by the software. Thus reading the h/w HEAD >> could put an unrelated (undefined, effectively random) value into >> the s/w 'head' -- A Bad Thing for the free space calculations. >> >> In addition, the old code could update a ringbuffer's 'head' value >> from the 'last_retired_head' even when the latter hadn't been recently >> updated and therefore had a value of -1; this would also confuse the >> freespace calculations. Now, we consume 'last_retired_head' in just >> one place, ensuring that this confusion does not arise. > > What? Not unless someone had broken it... > > This is the code I expect to see here based on requests: > > static int ring_wait(struct intel_ringbuffer *ring, int n) > { > int ret; > > trace_intel_ringbuffer_wait(ring, n); > > do { > struct i915_gem_request *rq; > > i915_gem_retire_requests__engine(ring->engine); > if (ring->retired_head != -1) { > ring->head = ring->retired_head; > ring->retired_head = -1; > > ring->space = intel_ring_space(ring); > if (ring->space >= n) > return 0; > } > > list_for_each_entry(rq, &ring->breadcrumbs, breadcrumb_link) > if (__intel_ring_space(rq->tail, ring->tail, > ring->size, I915_RING_RSVD) >= n) > break; > > if (WARN_ON(&rq->breadcrumb_link == &ring->breadcrumbs)) > return -EDEADLK; > > ret = i915_request_wait(rq); > } while (ret == 0); > > return ret; > } I like this sample code better than the version currently in the driver, because it, like my patch, has only one place where the value is transferred from 'retired_head' to 'head', and it's clearly guarded by a test that 'retired_head' is not -1. But this isn't actually the code we have, which has four places where this copying is done (two for LRC and two for legacy), and one in each path is not so guarded. So *that's* what I'm fixing. It looks like this code also assumes one request queue per ringbuffer, rather than one per engine, which may be nicer but isn't what we currently have. > and that works for both execlists and regular... > -Chris My calculate-freespace and update-freespace functions are likewise LRC-agnostic, although the wait_for_request/wait_for_space functions that they're called from have (regrettably) been duplicated. Not my call, though. Finally, my new code is more resilient against misuse than the current version. At present, if a sequence of emit()s overruns the allocated length (possibly due to display code injecting extra instruction without setting up its own ring_begin/add_request pair, or simply a bug), then the ring space calculation can in some cases claim that there's suddenly a LOT of space (due to the way the modulo arithmetic is done) and confusion will follow. My version will return "no space" rather than "a lot of space" in this situation. BTW, I have some local patches which enforce strict checking of ring_begin/add_request pairing and generates warnings if there's a mismatch or an overrun or any other misuse. We've been using these to help identify and eliminate code that just stuffs instructions into the ring without notifying the scheduler. Interested? .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx