On Thu, Sep 04, 2014 at 06:25:03PM +0300, Ville Syrjälä wrote:
> On Thu, Sep 04, 2014 at 04:09:02PM +0100, Chris Wilson wrote:
> > When run as a timer, i915_hangcheck_elapsed() must adhere to all the
> > rules of running in a softirq context. This is advantageous to us as we
> > want to minimise the risk that a driver bug will prevent us from
> > detecting a hung GPU. However, that is irrelevant if the driver bug
> > prevents us from resetting and recovering. Still it is prudent not to
> > rely on mutexes inside the checker, but given the coarseness of
> > dev->struct_mutex doing so is extremely hard.
> >
> > Give in and run from a work queue, i.e. outside of softirq.
> >
> > v2:
> >
> > The conversion does have one significant change, from the use of
> > mod_timer to schedule_delayed_work, means that the time that we execute
> > the first hangcheck is fixed and not continually deferred by later work.
> > This has the advantage of not allowing userspace to fill the ring before
> > hangcheck can finally run. At the same time, it removes the ability for
> > the interrupt to defer the hangcheck as well. This is sensible for that
> > an interrupt is only for a single engine, whereas we perform hangcheck
> > globally, so whilst one ring may have hung, the other could be running
> > normally and preventing the hangcheck from firing.
>
> But doesn't this make it so that we may not detect a hang unless more
> work gets submitted constantly? Eg.
>
> 1. execbuffer batch 1 -> queue hangcheck schedules work
> 2. execbuffer batch 2 -> queue hangcheck does nothing
> 3. execbuffer batch 3 -> queue hangcheck does nothing
> 4. hangcheck expires and sees progress up to batch 2 -> everything is fine

4.b hangcheck rearms itself as there is outstanding work

> 5. batch 3 hangs

6. hangcheck fires, sees progress, rearms
7. hangcheck fires, sees no progress, shoots the user.
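To make the sequence above concrete, here is a minimal userspace sketch of it. This is not driver code: struct hangcheck_sim and every sim_* helper are hypothetical names invented for illustration, and the only kernel behaviour modelled is that schedule_delayed_work() does nothing when the work is already pending. The real hangcheck scores each ring over several ticks; this toy model declares a hang on the first tick that shows no progress, so it counts fewer ticks than the numbered steps do.

```c
#include <stdbool.h>

/* Hypothetical toy state; none of these names exist in i915. */
struct hangcheck_sim {
	unsigned int submitted;  /* last batch handed to the GPU */
	unsigned int completed;  /* last batch the GPU finished */
	unsigned int last_seen;  /* completed count at the previous check */
	bool pending;            /* is the delayed work queued? */
	bool hung;               /* set once a check sees no progress */
};

/* Models schedule_delayed_work(): a no-op if the work is already pending. */
static bool sim_queue_hangcheck(struct hangcheck_sim *s)
{
	if (s->pending)
		return false;
	s->pending = true;
	return true;
}

/* execbuffer: submit one batch and try to queue the hangcheck. */
static void sim_submit(struct hangcheck_sim *s)
{
	s->submitted++;
	sim_queue_hangcheck(s);
}

/* The GPU retires one outstanding batch, if there is one. */
static void sim_complete(struct hangcheck_sim *s)
{
	if (s->completed < s->submitted)
		s->completed++;
}

/* The work handler: rearm only while work is outstanding and progressing. */
static void sim_hangcheck_fire(struct hangcheck_sim *s)
{
	if (!s->pending)
		return;
	s->pending = false;

	if (s->completed == s->submitted) {
		/* Idle: nothing outstanding, so do not rearm. */
		s->last_seen = s->completed;
		return;
	}
	if (s->completed == s->last_seen) {
		/* Outstanding work and no progress since the last check. */
		s->hung = true;
		return;
	}
	s->last_seen = s->completed;
	sim_queue_hangcheck(s);  /* step 4.b: rearm, work is outstanding */
}

/* Three batches are submitted, two retire, the third hangs. Nothing else
 * is submitted afterwards; only the handler's own rearm keeps it alive. */
static int sim_fires_until_hang(void)
{
	struct hangcheck_sim s = {0};
	int fires = 0;

	sim_submit(&s);
	sim_submit(&s);
	sim_submit(&s);
	sim_complete(&s);
	sim_complete(&s);

	while (s.pending && !s.hung) {
		sim_hangcheck_fire(&s);
		fires++;
	}
	return s.hung ? fires : -1;
}
```

In this model the hang is caught without any further execbuffer calls, because the handler rearms itself for as long as work is outstanding.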
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre