On Mon, Apr 14, 2014 at 01:03:58PM +0000, Mateo Lozano, Oscar wrote:
> > I would add a little more smarts to both the kernel and error-decode.
> > In the kernel, we can print the guilty request, which you can then use to
> > confirm that it is yours. That seems to me to be a stronger validation of
> > gem_error_capture, and a useful bit of information from hangstats that we do
> > not expose currently.
>
> That sounds good. I have to add a number of other things to
> i915_gpu_error as part of the Execlists code, so I'll add a "--- guilty
> request" as well and resubmit this test together with the series.

If we want this much smarts then we need a properly hanging batch, e.g.
the looping batch used in gem_reset_stats. The problem with that is that
it will kill the gpu if reset doesn't work (i.e. gen2/3), so we need to
skip this test there. Or maybe split things into 2 subtests and use the
properly hanging batch only when we do the extended guilty testing under
discussion here.

But in any case, just checking that the batch is somewhere in the ring
(with proper masking of the lower bits 0-11 ofc) and checking whether the
batch is correctly dumped (with the magic value) would catch a lot of the
past & present execbuf bugs - we've had issues with dumping fancy values
of 0 a lot.

For the guilty stuff we already have an extensive set of tests in
gem_reset_stats using the reset stats ioctl. And for the occasional "the
hang detection logic is busted" bug I think nothing short of a human
brain looking at the batch really helps. At least if we want to be
somewhat platform agnostic ...

So imo the current level of checking looks Good Enough. But I'm certainly
not going to stop you ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch