On Tue, May 16, 2017 at 09:43:52AM +0000, Lofstedt, Marta wrote:
>
>
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
> > Sent: Tuesday, May 16, 2017 12:04 PM
> > To: Lofstedt, Marta <marta.lofstedt@xxxxxxxxx>
> > Cc: Daniel Vetter <daniel@xxxxxxxx>; Martin Peres
> > <martin.peres@xxxxxxxxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [PATCH i-g-t] tests/initial_state: Add a test to capture
> > the state of the GPU
> >
> > On Tue, May 16, 2017 at 08:54:51AM +0000, Lofstedt, Marta wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
> > > > Sent: Tuesday, May 16, 2017 11:21 AM
> > > > To: Lofstedt, Marta <marta.lofstedt@xxxxxxxxx>
> > > > Cc: Daniel Vetter <daniel@xxxxxxxx>; Martin Peres
> > > > <martin.peres@xxxxxxxxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH i-g-t] tests/initial_state: Add a
> > > > test to capture the state of the GPU
> > > >
> > > > On Tue, May 16, 2017 at 07:42:51AM +0000, Lofstedt, Marta wrote:
> > > > > I hereby pull-out this patch.
> > > > > The idea of it was to know if we were already wedged at the
> > > > > beginning of testing, that would give us information on how to
> > > > > interpret silly results; such that test starting to get skipped
> > > > > and/or we got dmesg-warns/incomplete on tests that usually
> > > > > should be skipped.
> > > > > Also, we are planning to soon deploy a piglit.conf solution
> > > > > where testing will be terminated on wedged, so I agree that my
> > > > > test isn't really needed.
> > > >
> > > > Not everything is broken by wedged; internally we just use that as
> > > > an indicator that GEM is hosed. KMS should still work, we must still
> > > > be able to drive the displays to show the error and keep the servers
> > > > alive until the data is saved (and hopefully gracefully degrade that
> > > > we don't have to interrupt their immediate session).
> > >
> > > It doesn't matter if it is broken or not, if we are terminally
> > > wedged the rest of the result may be silly. Look for example at
> > > CI_DRM_2612, the fi-elk-e7500 is wedged at
> > > igt@gem_busy@basic-hang-default, then all test are skipped until
> > > gem_exec_reloc@basic-cpu-gtt-noreloc where the machine hangs, but
> > > it is a gem test so it should have been skipped, right. My
> > > conclusion from seeing this pattern multiple times is that after
> > > terminally wedged, silly things can happen, i.e. we can't trust the
> > > results, and since we don't want silly bugs, the CI testing should
> > > be stopped.
> >
> > The machine didn't hang, it was remotely killed because the run timed out.
> How do you know that?

The dmesg is a stream of flip timeouts until we run out of total BAT
runtime (12 minutes + some startup slack).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx