On Tue, 17 Jul 2012 12:12:39 +0100
Chris Wilson <chris at chris-wilson.co.uk> wrote:

> On Mon, 16 Jul 2012 11:51:55 -0700, Ben Widawsky <ben at bwidawsk.net> wrote:
> > Pros:
> > * Potential for per batch, or ring watchdog values. I believe when/if we
> >   get to GPGPU workloads, this is particularly interesting.
> > * Batch granularity hang detection. This mostly just makes hang
> >   detection and recovery a bit easier IMO.
> >
> > Cons:
> > * Blit ring doesn't have an interrupt. This means we still need the
> >   software watchdog, and it makes hang detection more complex. I've been
> >   led to believe future HW *may* have this interrupt.
> > * Semaphores
>
> Replacing the black magic for INSTDONE hang detection does seem like a
> sensible plan, but as long as we require the hangcheck timer we are only
> adding code complexity. So there really needs to be a compelling
> advantage for the watchdog, something that we cannot achieve with the
> existing method.

Just to be clear, INSTDONE can go away. I don't think it's valuable for
the blitter.

>
> For me, the criterion is whether we ever miss a hang or falsely accuse
> the hw of stopping. If I understand the watchdog correctly, it basically
> ensures the batch completes within a certain interval which we can
> codify into the existing hangcheck, so no USP.

Yeah. If we follow the Windows model, I think we just tweak the value
until we find something "good" and just always reset on the timeout
instead of doing instdone-foo.

>
> Or is there more magic waiting in the wings?
> -Chris
>

The magic was only a more straightforward way of finding the batch to
blame, and as I said on IRC, when I started I was planning to gut the
whole SW watchdog; that was the magic. FWIW I think we may see the
interrupt in future products; so it may still be worth considering
whether we want to move in this direction.

-- 
Ben Widawsky, Intel Open Source Technology Center