Re: [PATCH i-g-t] kms_atomic_transition: Output more finegrained progress info to avoid CI watchdog timeout

On Thu, Oct 19, 2017 at 06:48:30AM +0000, Lofstedt, Marta wrote:
> 
> 
> > -----Original Message-----
> > From: Intel-gfx [mailto:intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf
> > Of Daniel Vetter
> > Sent: Wednesday, October 18, 2017 5:36 PM
> > To: Latvala, Petri <petri.latvala@xxxxxxxxx>
> > Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Subject: Re:  [PATCH i-g-t] kms_atomic_transition: Output more
> > finegrained progress info to avoid CI watchdog timeout
> > 
> > On Wed, Oct 18, 2017 at 02:43:38PM +0300, Petri Latvala wrote:
> > > On Wed, Oct 18, 2017 at 02:29:33PM +0300, Imre Deak wrote:
> > > > The CI software watchdog (owatch) will timeout if the test doesn't
> > > > output anything for a long time on standard out or error. At least
> > > > the plane-all-modeset-transition and
> > > > plane-all-modeset-transition-fences
> > > > subtests run without any output longer than the watchdog timeout, so
> > > > output some more progress info.
> > >
> > > No, owatch is wrapping piglit, and pings the watchdog if _piglit_
> > > prints anything. Which it does on start/exit of a test.
> > 
> > tbh this sounds like owatch being dense and it shouldn't try to reboot this
> > quickly. What's the current owatch timeout?
> > 
> > Aside: What exactly does owatch give us? I thought jenkins also watches
> > machines and reboots them using the ac switch ... And owatch provides
> > spurious reboots?
> Daniel,
> Owatch gives us the knowledge that it was a test that took too long, i.e. we will know that it was not a system hang.
> We also know that the NMI watchdog didn't trigger.
> I believe this is extremely useful information when you are starting to debug the issue.
> 
> Imre, if you believe that owatch is preventing you from getting
> information to debug why these tests are taking such an extremely long
> time, it would be easy to increase the timeout or even do runs without
> it being enabled.

It would be useful to have a clear indication of whether the test really
hung or just took too long. Right now this isn't obvious without going
through the logs.
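
To illustrate the kind of explicit indication meant here, below is a
minimal sketch of an output-silence watchdog that reports its verdict
directly instead of leaving it to be inferred from the logs. It is not
owatch's actual implementation; the function name
run_with_silence_watchdog, the verdict strings and the timeout handling
are all hypothetical and chosen only for illustration.

```python
import queue
import subprocess
import sys
import threading


def run_with_silence_watchdog(cmd, silence_timeout):
    """Run cmd and return (verdict, lines).

    Hypothetical helper, not part of owatch or piglit.
    verdict is "completed" if the child exited on its own, or
    "silence-timeout" if it printed nothing for silence_timeout seconds.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    out = queue.Queue()

    def pump():
        # Forward each output line to the queue; None marks end of stream.
        for line in proc.stdout:
            out.put(line)
        out.put(None)

    threading.Thread(target=pump, daemon=True).start()

    lines = []
    while True:
        try:
            # Every line received restarts the silence window.
            line = out.get(timeout=silence_timeout)
        except queue.Empty:
            # No output within the window: stop the child and say so
            # explicitly, so this is not mistaken for a system hang.
            proc.kill()
            proc.wait()
            return "silence-timeout", lines
        if line is None:
            proc.wait()
            return "completed", lines
        lines.append(line.rstrip("\n"))


if __name__ == "__main__":
    verdict, _ = run_with_silence_watchdog(
        [sys.executable, "-c", "print('subtest done')"], 10.0)
    print("verdict:", verdict)
```

Because the wrapper itself is still running when it gives up, a
"silence-timeout" verdict also tells you the machine did not hang
outright; a real system hang would leave no verdict at all.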

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



