On Fri, Nov 10, 2017 at 02:49:25PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > Quoting Mika Kuoppala (2017-11-10 12:20:55)
> >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> >>
> >> > Quoting Mika Kuoppala (2017-11-10 11:53:47)
> >> >> We have a problem of distinguishing intended hangs
> >> >> submitted by igt during CI/bat and hangs that are nonintended
> >> >> happening in close proximity.
> >> >
> >> > Do we? I haven't had that problem in distinguishing them.
> >>
> >> Piglit can't tell them apart afaik. Due to info level.
> >
> > Piglit? If the test passes, it doesn't matter how the kernel got there,
> > the user behaviour is as expected. If the test wants to assert that it
> > didn't hang, it can do that.
>
> Through reset counts? At starters we could assert in framework that
> all tests that do not call igt_hang() expect reset count to
> stay the same between entry/exit.
>
> I see the logic behind that user behaviour is as expected.
>
> Would be good that CI folks chime in here and detail how
> they want things to work.

I'm very wary of having to sprinkle that all over CI tbh, but if it's in
the framework I guess it can work too. Will be fun to figure out how to
catch unintended hangs in the tests that do provoke hangs, but should be
doable.

But for adding it to the framework I think we're already putting way too
much random quiescent stuff in there, and for generic kms tests there's
kinda no need for that. So not entirely sold that this is the best
approach we can do.

A semi-middleground would be if we have new functions that open a gem fd
for rendering, and we have some sanity-checks to make sure that only when
you ask for rendering do the igt ioctl wrappers allow you to. Then we
could stuff all these checks in there. But that still leaves the issue
that a gpu hang on e.g. a s/r test or module reload won't be caught, and
we really want to catch these.

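For illustration only, here is a rough sketch in C of the reset-count
idea from the thread: remember the counter at test entry, let the test
declare each hang it provokes on purpose, and compare at exit. None of
this is existing igt code. The names hang_guard_*, sim_reset_count and
sim_gpu_reset are made up, the counter is a plain variable standing in
for whatever the driver reports, and the sketch assumes that every
intended hang results in exactly one reset, which may not hold on real
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's GPU reset counter.
 * A real framework would have to query the kernel instead. */
static unsigned int sim_reset_count;

/* Simulate one GPU reset, whether the test provoked it or not. */
static void sim_gpu_reset(void)
{
	sim_reset_count++;
}

/* Per-test bookkeeping (hypothetical, not an igt structure). */
struct hang_guard {
	unsigned int resets_at_entry; /* counter value at test entry */
	unsigned int expected_hangs;  /* hangs the test injected on purpose */
};

/* Call at test entry: snapshot the counter, no hangs expected yet. */
static void hang_guard_enter(struct hang_guard *g)
{
	assert(g != NULL);
	g->resets_at_entry = sim_reset_count;
	g->expected_hangs = 0;
}

/* Call each time the test deliberately hangs the GPU. */
static void hang_guard_expect_hang(struct hang_guard *g)
{
	assert(g != NULL);
	g->expected_hangs++;
}

/* Call at test exit: true only if the number of resets seen since
 * entry matches the number of hangs the test said it would provoke.
 * Assumes one reset per intended hang. */
static bool hang_guard_exit(const struct hang_guard *g)
{
	assert(g != NULL);
	unsigned int seen = sim_reset_count - g->resets_at_entry;

	return seen == g->expected_hangs;
}
```

A test that never calls hang_guard_expect_hang() passes the exit check
only if the counter did not move. A test that provokes hangs passes only
if the counter moved by exactly the declared amount, so one extra reset
in between would show up as a mismatch.
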
Module reload btw is also one case where just checking the reset counter
will not work. And module reload is exactly one of these cases where we
do want to make sure we don't misprogram the gpu so it dies.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch