On Wed, Jun 08, 2016 at 10:42:58AM +0200, Daniel Vetter wrote:
> On Fri, Jun 03, 2016 at 05:08:34PM +0100, Chris Wilson wrote:
> > We can forgo queuing the hangcheck from the start of every request
> > until we wait upon a request. This reduces the overhead of every
> > request, but may increase the latency of detecting a hang. However, if
> > nothing ever waits upon a hang, did it ever hang? It also improves the
> > robustness of the wait-request by ensuring that the hangchecker is
> > indeed running before we sleep indefinitely (and thereby ensuring that
> > we never actually sleep forever waiting for a dead GPU).
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>
> I think this will run into TDR patches, where we want a super-low-latency
> hangcheck in some cases. But then I think that's implemented by wrapping
> the batch in some special cs commands to insta-kill the engine if the
> timeout expired, so probably not a big problem. Still worth it to
> double-check with Mika I'd say.

Exactly. With TDR, hangcheck is relegated to denial of service
protection. This does not conflict with TDR; the two are complementary.

With timelines, we probably want to go even further and completely
divorce checking GPU state for hangcheck from checking for timeline
advancement. There, simply asking whether the waiter has been stuck
dramatically simplifies everything. TDR is again complementary, but
hangcheck still functions in case TDR fails or is disabled.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx