On Mon, Jul 26, 2021 at 12:45 PM Jan Kiszka <jan.kiszka@xxxxxxxxxxx> wrote:
>
> On 26.07.21 11:40, Jan Kiszka wrote:
> > On 26.07.21 11:19, Mantas Mikulėnas wrote:
> >> Hello,
> >>
> >> I have a Dell Inspiron 15-5547 laptop, with systemd configured to set
> >> the watchdog to a 2-minute expiry (due to reasons):
> >>
> >> # /etc/systemd/system.conf
> >> [Manager]
> >> RuntimeWatchdogSec=2min
> >>
> >> So far this setting has worked without problems (including kernels
> >> 5.12.15 and 5.13.1); however, with kernel 5.13.4 the system inevitably
> >> reboots after a few minutes of uptime.
> >>
> >> I have tracked the issue down to commit 5e65819a006e "watchdog:
> >> iTCO_wdt: Account for rebooting on second timeout" in the 5.13.x
> >> branch (commit cb011044e34c upstream). There are no unexpected reboots
> >> when running 5.13.4 with this commit reverted.
> >>
> >> Indeed with the original 5.13.4 kernel, `wdctl` always reports
> >> "Timeleft:" counting down from 60 seconds (sometimes very nearly
> >> reaching 0), even though "Timeout" is still reported to be 120.
> >>
> >> (systemd pokes the watchdog as part of its main loop, trying to do so
> >> approximately "between 1/4 and 1/2" of the configured interval.
> >> According to wdctl these pings usually happen every 35-50 seconds but
> >> sometimes nearly at the 60-second mark, and thanks to the kernel now
> >> also dividing the requested expiry by 2, which systemd is unaware of,
> >> sometimes this ends up being a *very* close race to 0.)
> >>
> >> This is a Haswell-era machine (i7-4510U) and seems to have a "version
> >> 0" watchdog:
> >>
> >> Jul 26 11:34:04 archlinux kernel: Linux version 5.13.4-arch2-1
> >> (linux@archlinux) (gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #1
> >> SMP PREEMPT Thu, 22 Jul 2021 20:46:28 +0000
> >> Jul 26 11:34:14 frost kernel: iTCO_vendor_support: vendor-support=0
> >> Jul 26 11:34:14 frost kernel: iTCO_wdt iTCO_wdt.3.auto: Found a Lynx
> >> Point_LP TCO device (Version=2, TCOBASE=0x1860)
> >> Jul 26 11:34:14 frost systemd[1]: Using hardware watchdog 'iTCO_wdt',
> >> version 0, device /dev/watchdog
> >> Jul 26 11:34:14 frost systemd[1]: Set hardware watchdog to 2min.
> >> Jul 26 11:34:14 frost kernel: iTCO_wdt iTCO_wdt.3.auto: initialized.
> >> heartbeat=30 sec (nowayout=0)
> >>
> >
> > Could you printk SMI_EN(p) in iTCO_wdt_set_timeout()
> > (drivers/watchdog/iTCO_wdt.c)? This is where we decide whether SMIs are
> > working, thus the countdown will only run once. Apparently, something is
> > wrong with the detection on this system.
> >
>
> Wait, found it:
>
> diff --git a/drivers/watchdog/iTCO_wdt.c b/drivers/watchdog/iTCO_wdt.c
> index b3f604669e2c..643c6c2d0b72 100644
> --- a/drivers/watchdog/iTCO_wdt.c
> +++ b/drivers/watchdog/iTCO_wdt.c
> @@ -362,7 +362,7 @@ static int iTCO_wdt_set_timeout(struct watchdog_device *wd_dev, unsigned int t)
>  	 * Otherwise, the BIOS generally reboots when the SMI triggers.
>  	 */
>  	if (p->smi_res &&
> -	    (SMI_EN(p) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
> +	    (inl(SMI_EN(p)) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
>  		tmrval /= 2;
>
>  	/* from the specs: */

Rebuilt with this and it fixes the issue, thanks.

--
Mantas Mikulėnas
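
For anyone reading this in the archive, here is a minimal userspace sketch of why the one-line change matters, assuming that SMI_EN(p) evaluates to the I/O port address of the register rather than to its contents (which is what the added inl() implies). The port address 0x1830, the bit positions, the fake_inl() stand-in and the use of seconds instead of timer ticks are all assumptions made for illustration; none of this is the driver's actual code.

```c
#include <stdint.h>

/* Bit positions assumed for illustration only. */
#define TCO_EN      (1u << 13)
#define GBL_SMI_EN  (1u << 0)

/* Hypothetical I/O port address of the SMI_EN register. */
#define SMI_EN_PORT 0x1830u

/*
 * Stand-in for inl(): returns simulated register contents in which
 * both TCO_EN and GBL_SMI_EN are set, i.e. SMIs are working.
 */
static uint32_t fake_inl(uint32_t port)
{
	(void)port;
	return TCO_EN | GBL_SMI_EN;
}

/* Pre-fix logic: masks the port address itself, not the register value. */
unsigned int timeout_buggy(unsigned int timeout)
{
	if ((SMI_EN_PORT & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
		timeout /= 2;
	return timeout;
}

/* Post-fix logic: reads the register first, then tests the bits. */
unsigned int timeout_fixed(unsigned int timeout)
{
	if ((fake_inl(SMI_EN_PORT) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
		timeout /= 2;
	return timeout;
}
```

With these assumed values, timeout_buggy(120) returns 60 while timeout_fixed(120) returns 120: masking the address makes the comparison depend on where the register happens to live rather than on whether SMIs are enabled. That would be consistent with the 60-second countdown wdctl reported above.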