Many watchdog drivers use the watchdog_stop_on_reboot() helper to stop
the watchdog on system reboot. Unfortunately, this logic is coded in the
driver's probe function and doesn't allow the user to decide what to do
during shutdown/reboot.

On the other hand, the Xen and QEMU watchdog drivers (xen_wdt and
i6300esb) may be configured to either send an NMI or turn off/reboot the
VM as the watchdog action. As the kernel may get stuck in any state,
sending NMIs can't reliably reboot the VM.

At Arista, we benefited from the following set-up: the emulated watchdogs
trigger a VM reset, and softdog is set to catch less severe conditions and
generate a vmcore. Just before reboot, the watchdog's timeout is increased
to a good-enough value (3 mins). That keeps the watchdog always running
and guarantees that the VM doesn't get stuck.

Provide new WDIOS_RUN_ON_REBOOT and WDIOS_STOP_ON_REBOOT ioctl options to
set up the strategy on reboot.

Signed-off-by: Dmitry Safonov <dima@xxxxxxxxxx>
---
 drivers/watchdog/watchdog_dev.c | 12 ++++++++++++
 include/linux/watchdog.h        |  6 ++++++
 include/uapi/linux/watchdog.h   |  3 ++-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 8b5c742f24e8..c854cd0245db 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -753,6 +753,18 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd,
 		}
 		if (val & WDIOS_ENABLECARD)
 			err = watchdog_start(wdd);
+
+		if (val & WDIOS_RUN_ON_REBOOT) {
+			if (val & WDIOS_STOP_ON_REBOOT) {
+				err = -EINVAL;
+				break;
+			}
+			watchdog_run_on_reboot(wdd);
+			err = 0;
+		} else if (val & WDIOS_STOP_ON_REBOOT) {
+			watchdog_stop_on_reboot(wdd);
+			err = 0;
+		}
 		break;
 	case WDIOC_KEEPALIVE:
 		if (!(wdd->info->options & WDIOF_KEEPALIVEPING)) {
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 417d9f37077a..9e2ca7754631 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -150,6 +150,12 @@ static inline void watchdog_stop_on_reboot(struct watchdog_device *wdd)
 	set_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
 }
 
+/* Use the following function to keep the watchdog running on reboot */
+static inline void watchdog_run_on_reboot(struct watchdog_device *wdd)
+{
+	clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
+}
+
 /* Use the following function to stop the watchdog when unregistering it */
 static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd)
 {
diff --git a/include/uapi/linux/watchdog.h b/include/uapi/linux/watchdog.h
index b15cde5c9054..bf19a5d3c987 100644
--- a/include/uapi/linux/watchdog.h
+++ b/include/uapi/linux/watchdog.h
@@ -53,6 +53,7 @@ struct watchdog_info {
 #define	WDIOS_DISABLECARD	0x0001	/* Turn off the watchdog timer */
 #define	WDIOS_ENABLECARD	0x0002	/* Turn on the watchdog timer */
 #define	WDIOS_TEMPPANIC		0x0004	/* Kernel panic on temperature trip */
-
+#define	WDIOS_RUN_ON_REBOOT	0x0008	/* Keep watchdog enabled on reboot */
+#define	WDIOS_STOP_ON_REBOOT	0x0010	/* Turn off the watchdog on reboot */
 
 #endif /* _UAPI_LINUX_WATCHDOG_H */
-- 
2.25.0
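
For illustration, user space could select the reboot strategy through the
existing WDIOC_SETOPTIONS ioctl roughly as follows. This is only a sketch
that assumes the patch above is applied: wdt_reboot_flags_valid() and
wdt_set_reboot_policy() are hypothetical helper names, and the fallback
#defines simply copy the values proposed in this patch so that the example
builds against older uapi headers.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Fallback for headers without this patch; values copied from the patch. */
#ifndef WDIOS_RUN_ON_REBOOT
#define WDIOS_RUN_ON_REBOOT	0x0008
#endif
#ifndef WDIOS_STOP_ON_REBOOT
#define WDIOS_STOP_ON_REBOOT	0x0010
#endif

static_assert((WDIOS_RUN_ON_REBOOT & WDIOS_STOP_ON_REBOOT) == 0,
	      "reboot options must be distinct bits");

/*
 * Hypothetical helper: mirrors the check added to watchdog_ioctl(), which
 * rejects a request that sets both reboot options at once with -EINVAL.
 * Returns 1 if the combination is acceptable, 0 otherwise.
 */
static int wdt_reboot_flags_valid(int flags)
{
	return !((flags & WDIOS_RUN_ON_REBOOT) &&
		 (flags & WDIOS_STOP_ON_REBOOT));
}

/*
 * Hypothetical helper: ask the watchdog behind fd to keep running
 * (keep_running != 0) or to stop (keep_running == 0) on reboot.
 * Returns 0 on success or a negative errno value.
 */
static int wdt_set_reboot_policy(int fd, int keep_running)
{
	int flags = keep_running ? WDIOS_RUN_ON_REBOOT : WDIOS_STOP_ON_REBOOT;

	/* WDIOC_SETOPTIONS takes a pointer to an int bit mask. */
	if (ioctl(fd, WDIOC_SETOPTIONS, &flags) < 0)
		return -errno;
	return 0;
}
```

A caller would open the watchdog device (typically /dev/watchdog, though
the path depends on the system), call wdt_set_reboot_policy(fd, 1) and then
raise the timeout shortly before reboot, as in the set-up described in the
commit message.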