On Thu, Apr 18, 2013 at 09:52:57AM -0400, Don Zickus wrote:
> On Thu, Apr 18, 2013 at 06:49:04AM -0700, Guenter Roeck wrote:
> > On Thu, Apr 18, 2013 at 09:00:09AM -0400, Don Zickus wrote:
> > > On Wed, Apr 17, 2013 at 02:49:59PM -0700, Eric W. Biederman wrote:
> > > > Don Zickus <dzickus at redhat.com> writes:
> > > >
> > > > > A common problem with kdump is that during the boot up of the
> > > > > second kernel, the hardware watchdog times out and reboots the
> > > > > machine before a vmcore can be captured.
> > > > >
> > > > > Instead of telling customers to disable their hardware watchdog
> > > > > timers, I hacked up a hook to put in the kdump path that provides
> > > > > one last kick before jumping into the second kernel.
> > > > >
> > > > > The assumption is the watchdog timeout is at least 10-30 seconds
> > > > > long, enough to get the second kernel to userspace to kick the
> > > > > watchdog again, if needed.
> > > >
> > > > Why not double the watchdog timeout? and/or pet the watchdog a little
> > > > more frequently.
> > >
> > > I am not sure if the watchdog timeouts can be doubled. I think Guenter
> > > was saying some have a max of a couple seconds?? Petting a little more
> > > frequently might be an option. Guenter, can that be done with a softdog
> > > option?
> > >
> > Most watchdog drivers permit at least a minute. Some are more limited.
> > Worst I have seen is the BookE watchdog timer (non-Freescale version)
> > which has a maximum of three seconds. But that is broken anyway.
> >
> > Most hardware watchdogs implement a softdog on top of the hardware watchdog
> > if the hardware needs to be pinged faster than every 60 seconds.
> >
> > So, yes, for the most common case you should actually be able to live with
> > a, say, 30-60 second timeout which is pinged at least every 5-10 seconds.
> > I thought that somehow did not work in your case. Maybe a misunderstanding?
>
> No, that will probably work. It is my misunderstanding. Is there a
> common way to check the timeout length and the ping frequency?
>
Usually it is configured in /etc/watchdog.conf if the watchdog package is
installed. The standard ping interval is "interval", the timeout is
"watchdog-timeout". See "man watchdog.conf" for details.

Minimum and maximum values for a given watchdog driver are not exported
to user space, so you would have to look into the driver sources to find
out what they are.

Guenter
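
P.S. A minimal sketch of how the two settings could be read back, assuming
/etc/watchdog.conf uses plain "key = value" lines with "#" comments. The
helper name read_watchdog_settings is hypothetical and not part of the
watchdog package; keys that are missing or commented out are reported as
None rather than guessed, since the daemon's defaults are not stated here.

```python
import re


def read_watchdog_settings(conf_text):
    """Return the 'interval' and 'watchdog-timeout' values found in conf_text.

    Hypothetical helper: keys that are absent or commented out map to None.
    """
    settings = {"interval": None, "watchdog-timeout": None}
    for line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = re.match(r"^([\w-]+)\s*=\s*(\d+)$", line)
        if match and match.group(1) in settings:
            settings[match.group(1)] = int(match.group(2))
    return settings


if __name__ == "__main__":
    try:
        with open("/etc/watchdog.conf") as conf:
            print(read_watchdog_settings(conf.read()))
    except OSError as err:
        print("could not read /etc/watchdog.conf:", err)
```

This only reports what the daemon is configured to do. It says nothing about
the minimum or maximum the driver accepts.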