Quoting Tvrtko Ursulin (2018-07-19 13:14:38) > > On 19/07/2018 12:59, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2018-07-19 12:49:13) > >> > >> On 19/07/2018 08:50, Chris Wilson wrote: > >>> There's a race between idling the engine and finishing off the last > >>> tasklet (as we may kick the tasklets after declaring an individual > >>> engine idle). However, since we do not need to access the device until > >>> we try to submit to the ELSP register (processing the CSB just requires > >>> normal CPU access to the HWSP, and when idle we should not need to > >>> submit!) we can defer the assertion unto that point. The assertion is > >>> still useful as it does verify that we do hold the longterm GT wakeref > >>> taken from request allocation until request completion. > >>> > >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107274 > >>> Fixes: 9512f985c32d ("drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)") > >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> --- > >>> drivers/gpu/drm/i915/intel_lrc.c | 25 +++++++++++-------------- > >>> 1 file changed, 11 insertions(+), 14 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > >>> index db5351e6a3a5..9d693e61536c 100644 > >>> --- a/drivers/gpu/drm/i915/intel_lrc.c > >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c > >>> @@ -452,6 +452,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine) > >>> struct execlist_port *port = execlists->port; > >>> unsigned int n; > >>> > >>> + /* > >>> + * We can skip acquiring intel_runtime_pm_get() here as it was taken > >>> + * on our behalf by the request (see i915_gem_mark_busy()) and it will > >>> + * not be relinquished until the device is idle (see > >>> + * i915_gem_idle_work_handler()). As a precaution, we make sure > >>> + * that all ELSP are drained i.e. we have processed the CSB, > >>> + * before allowing ourselves to idle and calling intel_runtime_pm_put(). > >>> + */ > >>> + GEM_BUG_ON(!engine->i915->gt.awake); > >>> + > >>> /* > >>> * ELSQ note: the submit queue is not cleared after being submitted > >>> * to the HW so we need to make sure we always clean it up. This is > >>> @@ -1043,16 +1053,6 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine) > >>> { > >>> lockdep_assert_held(&engine->timeline.lock); > >>> > >>> - /* > >>> - * We can skip acquiring intel_runtime_pm_get() here as it was taken > >>> - * on our behalf by the request (see i915_gem_mark_busy()) and it will > >>> - * not be relinquished until the device is idle (see > >>> - * i915_gem_idle_work_handler()). As a precaution, we make sure > >>> - * that all ELSP are drained i.e. we have processed the CSB, > >>> - * before allowing ourselves to idle and calling intel_runtime_pm_put(). > >>> - */ > >>> - GEM_BUG_ON(!engine->i915->gt.awake); > >>> - > >>> process_csb(engine); > >>> if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT)) > >>> execlists_dequeue(engine); > >>> @@ -1073,10 +1073,7 @@ static void execlists_submission_tasklet(unsigned long data) > >>> engine->execlists.active); > >>> > >>> spin_lock_irqsave(&engine->timeline.lock, flags); > >>> - > >>> - if (engine->i915->gt.awake) /* we may be delayed until after we idle! */ > >>> - __execlists_submission_tasklet(engine); > >>> - > >>> + __execlists_submission_tasklet(engine); > >>> spin_unlock_irqrestore(&engine->timeline.lock, flags); > >>> } > >>> > >>> > >> > >> Why we won't hit the assert on the elsp submit side now? AFAIR the > >> discussion about this particular line concluded that direct tasklet call > >> can race with busy->idle transition. So even is process_csb doens't need > >> the assert, that part I get, the part about the race now again bothers > >> me. Perhaps I just forgot what I thought I understood back then.. :( > > > > Same race, I just didn't think it through that it could change within > > the space of a couple of lines. :| > > > >> Should this call process_csb only if !gt.awake? But sounds terribly > >> dodgy.. Why would execlists.active be set if we are not awake.. > > > > Have to remember it's i915->gt.awake no execlists->active (that's what I > > briefly hoped for...) I looked at ways we might decouple the tasklet (we > > can't just use tasklet_disable) but that looks overkill, and I can't see > > any way we can guarantee that we won't randomly kick it later. > > > > I did consider reordering the final wait_for(engine_is_idle()) and > > tasklet_flush, but I couldn't convince myself that was enough to > > guarantee we wouldn't get an even later interrupt to kick the tasklet. > > > > So left it at just using the CSB to filter out the spurious call, and > > relying on that we are idle to avoid the submit. The assertion still > > plays the same role as it always has done, making sure the actual > > register access is covered by our wakeref. > > So the thing I am thinking of, and you tell me if it is not the one: > > 1. Idle work handler runs > 2. Goes to idle the engines > 3. New request comes in from a separate thread > 4. intel_engine_is_idle sees execlist.active is set > 5. Calls the tasklet > > But gt.awake is true at this point, so assert does not fire. > > If the status of execlists.active changes from true to false between the > check in 4 and tasklet kick in 5. Okay, but in this case how did we pass > the the gt.awake and gt.active_requests checks at the top of the idle > work handler? active_requests is protected by struct_mutex, so stable for idle_worker deciding to set gt.awake=false. The assert we have is all about gt.awake and the theory was that after parking the engine we wouldn't get another tasklet. That's the theory full of holes. > The whole state change has to happen in the middle of idle work handler > I guess. Transition from !awake to awake before the > intel_engines_are_idle_loop, and actually process the csb in the middle > of the intel_engine_is_idle checks. > > Okay, think I answered my own question. My mindset was to serialized in > this instance. :) Same answer, so yup. The challenge being getting that serialisation with the interrupt handler solid. Or just be prepared that we get the occasional late tasklet_schedule and let it nop away. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx