Re: [PATCH] watchdog: core: module param to activate watchdog

Hi Wim,

On Mon, Mar 03, 2014 at 11:19:40AM +0100, Wim Van Sebroeck wrote:
> Hi Markus,
> 
> > Hi Wim,
> > 
> > On Sun, Mar 02, 2014 at 10:43:23AM +0100, Wim Van Sebroeck wrote:
> > > Hi Markus,
> > > 
> > > > Many watchdog drivers reset the watchdog device on initialization. This
> > > > is a problem if the watchdog is activated by the bootloader and should
> > > > be active the whole time until the userspace can write to it.
> > > > 
> > > > This patch adds a module parameter (watchdog.activate_first) that
> > > > activates the first registered watchdog. Using this parameter it is
> > > > possible to have an active watchdog during the whole boot process.
> > > > 
> > > > Signed-off-by: Markus Pargmann <mpa@xxxxxxxxxxxxxx>
> > > 
> > > NAK. It is the responsibility of the watchdog device driver to do this
> > > and not that of the core. The core can't know which device(s) are 
> > > running and can't be turned off. So you can have more than 1 device
> > > that needs to keep going...
> > > 
> > > The normal way to do this is by using a timer. Take a look at
> > > at91sam9_wdt.c . We have a timer there that pings the watchdog if
> > > 1) the watchdog device is not active for userspace (so before opening
> > > /dev/watchdog and after closing /dev/watchdog correctly)
> > > 2) pings 1/2 or 1/4 of the watchdog heartbeat time as long as the
> > > userspace ping keeps going. (Else it will expire and you trigger a 
> > > reboot).
> > > (Note: the heartbeat is the period after which a watchdog hardware
> > > device will trigger a reboot, timeout is the period for userspace.
> > > So for most devices heartbeat and timeout are the same, but for 
> > > devices where a timer is used (we also use it if the heartbeat value
> > > is too small (i.e. 1 second)) the values have a different meaning.)
> > > 
> > > So the logic for a watchdog device driver is:
> > > at startup the watchdog device driver should make sure that a reboot
> > > can only occur when /dev/watchdog is open. This means that if a
> > > watchdog device is active at boot it should either be stopped (that's
> > > why most drivers have a stop in their init/probe function) or have a 
> > > timer that pings the watchdog as long as it is not active for 
> > > userspace (like at91sam9_wdt.c, via_wdt.c, pika_wdt.c, ...)
> > 
> > My goal is different to those watchdog drivers. I want to use the
> > watchdog device to ensure that kernel and userspace work properly. If
> > one of them fails I want to reset the system and boot into a fallback
> > system. In this case, it doesn't make sense to ping the watchdog
> > through the kernel itself. This may work when the kernel is guaranteed
> > to boot successfully. But the kernel may as well never reach the
> > userspace because of some bug somewhere.
> >
> > Of course it is possible to add this behaviour to a single driver that
> > is used on this specific hardware. But I think the concept of an
> > active watchdog during the whole boot process is not so uncommon. For
> > example if you want to update embedded systems, you may want to have a
> > fallback kernel and rootfs.
> 
> The goal of watchdog device drivers is to reboot when your system gets
> unstable. So when userspace is "up" and the system has booted correctly.
> What you want is something totally different and not wanted for most
> systems because you are going to wreck filesystems: think about system
> booting, getting rebooted, fsck starts, system gets rebooted during fsck,
> ... 
> 
> To achieve what you want you should make sure that your "BIOS" starts
> with the watchdog enabled and with a very long watchdog heartbeat (so
> that it doesn't need to get pinged). Then the kernel can boot and 
> userspace can start and then you can do the necessary with the
> /dev/watchdog functionality. Only thing you need to take care of is 
> that you don't stop the watchdog (via a module param or something 
> similar) during the probe/init of the driver.
> I believe we already have a winbond driver doing that. Just checked:
> it's w83697hf_wdt.c with the early_disable module parameter.
> 
> So, this is something that needs to be taken into account during
> probe/init of the driver and for each watchdog that needs it.
> And thus not for the first watchdog that gets registered.
> Furthermore: not all watchdog devices have a heartbeat that is long 
> enough to achieve this functionality.
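
For reference, the timer scheme described earlier in this thread (the
kernel pings the hardware while /dev/watchdog is closed, and afterwards
only while userspace keeps pinging within its timeout) can be modelled
in plain userspace C. This is only a sketch under assumptions: the
struct, the function names, and the tick-based clock are hypothetical
and are not taken from at91sam9_wdt.c or any other driver. It also
assumes a heartbeat of at least 2 ticks, with the timer firing every
heartbeat/2 ticks.

```c
#include <stdbool.h>

/* Hypothetical model of a timer-assisted watchdog; not real driver code. */
struct wdt_model {
    unsigned int heartbeat;      /* hardware resets after this many ticks without a hardware ping */
    unsigned int timeout;        /* userspace must ping at least this often (in ticks) */
    bool open;                   /* true once /dev/watchdog has been opened */
    unsigned int last_hw_ping;   /* tick of the last ping sent to the hardware */
    unsigned int last_user_ping; /* tick of the last ping received from userspace */
};

/* Timer callback: ping the hardware unless userspace owns the device and went silent. */
static void timer_tick(struct wdt_model *w, unsigned int now)
{
    bool user_alive = (now - w->last_user_ping) < w->timeout;

    if (!w->open || user_alive)
        w->last_hw_ping = now;
}

/* True once the hardware has gone a full heartbeat without a ping. */
static bool hw_expired(const struct wdt_model *w, unsigned int now)
{
    return (now - w->last_hw_ping) >= w->heartbeat;
}

/*
 * Run the model from tick 0 to 'end'. Userspace opens the device at
 * 'open_at' and then pings on every tick up to and including
 * 'last_ping_at'. Returns the tick at which the hardware would reset,
 * or -1 if it never does within the simulated window.
 */
static int simulate(unsigned int heartbeat, unsigned int timeout,
                    unsigned int open_at, unsigned int last_ping_at,
                    unsigned int end)
{
    struct wdt_model w = { heartbeat, timeout, false, 0, 0 };
    unsigned int now;

    for (now = 0; now <= end; now++) {
        if (now == open_at) {
            w.open = true;
            w.last_user_ping = now;   /* opening counts as the first ping */
        }
        if (w.open && now > open_at && now <= last_ping_at)
            w.last_user_ping = now;
        if (now % (heartbeat / 2) == 0)
            timer_tick(&w, now);
        if (hw_expired(&w, now))
            return (int)now;
    }
    return -1;
}
```

With heartbeat 4 and timeout 10, simulate(4, 10, 5, 20, 100) returns
32: userspace goes silent after tick 20, the timer stops pinging the
hardware once tick 30 is reached, and the hardware expires 4 ticks
after its last ping at tick 28. If the device is never opened, the
timer keeps the hardware alive and the result is -1, which is exactly
the behaviour that does not cover a kernel that hangs before userspace
starts.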

Okay, thanks for the clarification. So I will change the driver and add
an appropriate parameter to it instead, as the winbond driver does.
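
A rough idea of what such a parameter would decide at probe time,
written as a userspace model so it can be compiled and tried outside
the kernel. The names wdt_hw and wdt_probe are hypothetical, and the
default value used here is an assumption for illustration. In a real
driver the flag would be declared with module_param() rather than as a
plain variable.

```c
#include <stdbool.h>

/* Hypothetical hardware state as the bootloader left it. */
struct wdt_hw {
    bool running;
};

/*
 * Stands in for a module parameter in the style of early_disable.
 * Assumed default: 1, i.e. stop a running watchdog at probe time.
 */
static int early_disable = 1;

/* Model of the probe/init decision only; no real register access. */
static void wdt_probe(struct wdt_hw *hw)
{
    if (early_disable && hw->running)
        hw->running = false;  /* reboot is only possible once /dev/watchdog is opened */

    /*
     * With early_disable == 0 the watchdog is left exactly as the
     * bootloader configured it, so it keeps counting through the rest
     * of the boot until userspace opens /dev/watchdog and pings it.
     */
}
```

This only works if the bootloader programs a heartbeat long enough to
cover the whole boot, since nothing pings the hardware in between.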

Thanks,

Markus


-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
