On Thu, Oct 20, 2016 at 10:46:01AM +0100, Chris Wilson wrote:
> On Thu, Oct 20, 2016 at 11:29:05AM +0200, Daniel Vetter wrote:
> > On Thu, Oct 20, 2016 at 10:07:39AM +0100, Chris Wilson wrote:
> > > For the basic error state, we only desire that an error state be created
> > > following a hang. For that purpose, we do not need a real hang (slow
> > > 6-12s) but can inject one instead (fast <1s).
> > >
> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >
> > Should we instead speed up hangcheck? I think there's lots of value in
> > making sure not just error dumping, but also hang detection works somewhat
> > in BAT. Since if it doesn't any attempt at a full run will lead to pretty
> > serious disasters. And I have this dream that BAT is the gating thing
> > deciding whether a patch series deserves a complete pre-merge run ;-)
>
> We have full-hang detection in BAT elsewhere as well. This particular
> test was only asking the question "do we generate an error state", hence
> why I felt it was safe to just do that and skip a simulated hang.
>
> > But since this is a controlled environment we could make hangcheck
> > super-fast at timing out with some debugfs knob. Would probably also help
> > a lot with speeding up the gazillion of testcases in gem_reset_stats.
>
> I have considered i915.hangcheck_interval_ms many a time. It is not just
> the interval but the hangcheck score threshold to consider. If we can
> trust our activity detection, we would be safe with a hangcheck every
> jiffy (at some overhead mind you), but we would declare a DoS too soon.

Thinking of which, Mika did have some patches to move towards a
time-accrued metric...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx