On Fri, Mar 06, 2015 at 05:40:50PM +0000, Dave Gordon wrote: > On 06/03/15 15:57, Daniel Vetter wrote: > > On Fri, Mar 06, 2015 at 11:38:44AM +0000, John Harrison wrote: > >> On 05/03/2015 16:14, Daniel Vetter wrote: > >>> On Thu, Mar 05, 2015 at 03:06:42PM +0000, John Harrison wrote: > >>>> On 05/03/2015 14:44, Daniel Vetter wrote: > >>>>> Imo reserving a bit of ring space for each add_request should be solid. > >>>>> Userspace uses the exact same reservation logic for adding end-of-batch > >>>>> workarounds. The only thing needed to make this solid is to WARN if > >>>>> add_request ends up using more ring space than what we've reserved (not > >>>>> just when it actually runs out, that obviously doesn't happen often > >>>>> enough for testing). > >>>> The problem is that there could be multiple requests being processed in > >>>> parallel. This is especially true with the scheduler. Userland could submit > >>>> a whole stream of batches that all get queued up in the scheduler. Only > >>>> later do they get submitted to the hardware. The request must be allocated > >>>> up front because there is no other means of tracking them. But reserving > >>>> space at that point won't work because you either end up reserving massive > >>>> amounts of space if the reserve is cumulative, or not enough if only one > >>>> slot is reserved. > >>> At least with execlist we don't have that problem really since writing the > >>> ringbuffer is done synchronously and directly. > >>> > >>> For the legacy scheduler I expect that we won't do any of the ringbuf > >>> writes directly and instead that's all done by the scheduler > >>> asynchronously. > >>> > >>> So this should just be an issue while we are converting to the scheduler > >>> or on platforms that will never have one. And imo the request ringbuf > >>> reservation is the solution with the simplest impact on the design. > >> I don't understand what you mean here. The scheduler decouples the > >> submission of a batch buffer to the driver with the submission of that batch > >> buffer to the hardware. The submission code path itself is identical. The > >> scheduler does not do any hardware writes - it merely adds an extra layer > >> between the user interface code and the hardware access code. The request > >> creation must be done at driver submission time. The hardware submission > >> could be an arbitrary amount of time later. In the intervening time, any > >> number of new batch buffers may be submitted to the driver and thus new > >> requests be created. This is true for both legacy and execlist mode. > >> > >> The space reserve would have to be done at the start of the hardware > >> submission process not at driver submission time. I.e. it would need to be a > >> separate and new i915_gem_request_reserve_space() call at the start of the > >> back half of the exec buffer code. It could not be done during request > >> creation. > > > > Yeah with the scheduler we need to take out the reservation again since it > > doesn't make a lot of sense - the scheduler is the safety net that > > guarantees execution (barring gpu death ofc as always) when the execbuf > > ioctl returns 0. Maybe in the end we need a special alloc_request function > > for execbuf to only reserve when the scheduler isn't around. > > > > That additional work is just part of the prize we have to pay to never > > break things while refactoring code. And since we'll get the execlist > > scheduler in first that reservation code will probably stick around for a > > while (or maybe forever even if we decide against upstreaming the gen7 > > scheduler). > > > > Does this make sense? > > > > Thanks, Daniel > > I already put a pre-reservation in the preemptive version of the > scheduler. It won't start to write anything in the ringbuffer unless > it's guaranteed there's enough space to write the whole set of commands > to run a batch, from preamble to finalisation. That means that none of > the ring_begin() commands can fail, and we won't leave a partial set of > commands in the ring to confuse any subsequent (successful) submissions. > > And yes, I added warnings if the commands emitted to the ring overran > the amount of reserved space, so that developers would know if they'd > increased the total ringspace requirement. > > If there are any other failure modes while writing to the ring, we > should simply back up to the start of the sequence (i.e. rewind the > software copy of TAIL to where it was after we checked for space -- easy > because we haven't updated h/w TAIL yet), and we can leave the request > queued for subsequent retry (or fail it as though it had been submitted > and then killed by TDR, if the failure mode suggests it's not retryable). Sounds good. > [aside] One version of this was posted as > http://lists.freedesktop.org/archives/intel-gfx/2014-December/057247.html but > that re-used intel_ring_begin() as a reserve-space call, whereas another > version has a separate nonblocking function intel_ring_test_space() for > prechecking available space [/aside] I prefer the explicit reservation over allowing nesting. Allowing nesting gives us a lot more rope than we need I think, so much room for "creative abuse" (i.e. really hard to understand bugs). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx