Re: [PATCH] watchdog: core: module param to activate watchdog

Hi Markus,

> Hi Wim,
> 
> On Sun, Mar 02, 2014 at 10:43:23AM +0100, Wim Van Sebroeck wrote:
> > Hi Markus,
> > 
> > > Many watchdog drivers reset the watchdog device on initialization. This
> > > is a problem if the watchdog is activated by the bootloader and should
> > > be active the whole time until the userspace can write to it.
> > > 
> > > This patch adds a module parameter (watchdog.activate_first) that
> > > activates the first registered watchdog. Using this parameter it is
> > > possible to have an active watchdog during the whole boot process.
> > > 
> > > Signed-off-by: Markus Pargmann <mpa@xxxxxxxxxxxxxx>
> > 
> > NAK. It is the responsibility of the watchdog device driver to do this
> > and not that of the core. The core can't know which device(s) are 
> > running and can't be turned off. So you can have more than 1 device 
> > that needs to keep going...
> > 
> > The normal way to do this is by using a timer. Take a look at
> > at91sam9_wdt.c. We have a timer there that pings the watchdog
> > 1) while the watchdog device is not active for userspace (so before
> > opening /dev/watchdog and after closing /dev/watchdog correctly), and
> > 2) at 1/2 or 1/4 of the watchdog heartbeat time for as long as the
> > userspace ping keeps going. (Otherwise it will expire and you trigger
> > a reboot.)
> > (Note: the heartbeat is the period after which a watchdog hardware
> > device will trigger a reboot, timeout is the period for userspace.
> > So for most devices heartbeat and timeout are the same, but for 
> > devices where a timer is used (we also use it if the heartbeat value
> > is too small (i.e. 1 second)) the values have a different meaning.)
> > 
> > So the logic for a watchdog device driver is:
> > at startup the watchdog device driver should make sure that a reboot
> > can only occur when /dev/watchdog is open. This means that if a
> > watchdog device is active at boot it should either be stopped (that's
> > why most drivers have a stop in their init/probe function) or have a 
> > timer that pings the watchdog as long as it is not active for 
> > userspace (like at91sam9_wdt.c, via_wdt.c, pika_wdt.c, ...)
> 
> My goal is different to those watchdog drivers. I want to use the
> watchdog device to ensure that kernel and userspace work properly. If
> one of them fails I want to reset the system and boot into a fallback
> system. In this case, it doesn't make sense to ping the watchdog
> through the kernel itself. This may work when the kernel is guaranteed
> to boot successfully. But the kernel may as well never reach the
> userspace because of some bug somewhere.
>
> Of course it is possible to add this behaviour to a single driver that
> is used on this specific hardware. But I think the concept of an
> active watchdog during the whole boot process is not so uncommon. For
> example if you want to update embedded systems, you may want to have a
> fallback kernel and rootfs.

The goal of watchdog device drivers is to reboot when your system gets
unstable, i.e. once userspace is "up" and the system has booted correctly.
What you want is something totally different and not wanted for most
systems, because you are going to wreck filesystems: think about the
system booting, getting rebooted, fsck starting, the system getting
rebooted during fsck, ...

To achieve what you want you should make sure that your "BIOS" starts
with the watchdog enabled and with a very long watchdog heartbeat (so
that it doesn't need to get pinged). Then the kernel can boot,
userspace can start, and from there you do what is necessary through
/dev/watchdog. The only thing you need to take care of is that the
driver doesn't stop the watchdog during its probe/init (controlled via
a module param or something similar).
I believe we already have a Winbond driver doing that. Just checked:
it's w83697hf_wdt.c with the early_disable module parameter.
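
As a rough illustration of that probe/init decision, here is a small
user-space model in C. It is only a sketch: struct wdt_model and
wdt_model_probe() are made-up names for this mail and are not code from
w83697hf_wdt.c. The only thing borrowed from that driver is the idea of
an early_disable switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model of a watchdog device; not a real kernel structure. */
struct wdt_model {
	bool running;		/* true if the hardware counter is ticking */
	unsigned int timeout_s;	/* heartbeat currently programmed, in seconds */
};

/*
 * Hypothetical probe/init step. With early_disable set, a watchdog that
 * the bootloader left running is stopped. With early_disable cleared,
 * the device is left untouched, so it keeps counting until userspace
 * takes over via /dev/watchdog.
 */
static void wdt_model_probe(struct wdt_model *wdt, bool early_disable)
{
	assert(wdt != NULL);

	if (early_disable && wdt->running)
		wdt->running = false;
}
```

Probing with early_disable cleared leaves both the running state and
the programmed heartbeat as the "BIOS" set them, which is the behaviour
you are after.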

So, this is something that needs to be taken into account during
probe/init of the driver, for each watchdog that needs it, and thus
not just for the first watchdog that gets registered.
Furthermore: not all watchdog devices have a heartbeat that is long 
enough to achieve this functionality.
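
To make that last point concrete, a minimal sketch of the check you
would have to do per device. heartbeat_covers_boot() and boot_budget_s
are hypothetical names; the real numbers have to come from your
hardware's documentation and from measuring your own boot.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if a watchdog armed by the firmware with its maximum
 * heartbeat can survive the whole boot without being pinged.
 * Both values are in seconds. boot_budget_s is an assumed worst-case
 * time from firmware hand-off until userspace opens /dev/watchdog.
 */
static bool heartbeat_covers_boot(unsigned int max_heartbeat_s,
				  unsigned int boot_budget_s)
{
	assert(boot_budget_s > 0);

	/* Require strictly more than the budget, so there is some margin. */
	return max_heartbeat_s > boot_budget_s;
}
```

A device whose heartbeat tops out at a second or so fails this check for
any realistic boot, so the long-heartbeat approach is not available there.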

Kind regards,
Wim.
