Re: watchdog: how to enable?

On Sun, Nov 17, 2019 at 3:12 AM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 11/16/19 10:34 AM, Muni Sekhar wrote:
> > On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>
> >> On 11/15/19 7:03 PM, Muni Sekhar wrote:
> >> [ ... ]
> >>>>
> >>>> Another possibility, of course, might be to enable a hardware watchdog
> >>>> in your system (assuming it supports one). I personally would not trust
> >>>> the NMI watchdog to detect a system hang. After all, there are
> >>>> situations where even NMIs no longer work.
> >>>
> >>> From dmesg, is it possible to know whether my system supports a
> >>> hardware watchdog?
> >>> Assuming that my system supports a hardware watchdog, how do I
> >>> enable it to debug the system freeze issues?
> >>>
> >>
> >> Hardware watchdog support really depends on the board type. Most PC
> >> mainboards support a watchdog in the Super-IO chip, but on some it is
> >> not wired correctly. On embedded boards it is often built into the SoC.
> >> The easiest way to see if you have a watchdog would be to check for the
> >> existence of /dev/watchdog. However, on a PC that would most likely
> >> not be there because the necessary module is not auto-loaded.
> >> If you tell us your board type, or better the Super-IO chip on the board,
> >> we might be able to help.
> >
> > I have two systems with the same configuration. On one I installed
> > a vanilla kernel, and I see the /dev/watchdog and /dev/watchdog0
> > nodes. On the other I am running the Ubuntu distribution kernel,
> > but I don’t see any watchdog device node. So it looks like I need to
> > manually load the kernel module in the distro kernel. Is there a way
> > to find out which kernel module corresponds to the /dev/watchdog node?
> >
> > # ls -l /dev/watchdog*
> > crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
> > crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0
> >
> > # ps -ax | grep watchdog
> >    678 ?        S      0:00 [watchdogd]
> >
> > Regarding the Super-IO chip, how do I find out its model?
> >
> You could try to run sensors-detect (from the "sensors" package).
>
> If you can boot a system with /dev/watchdog0, you should see the type
> in /sys/class/watchdog/watchdog0/identity.

I could not find the /sys/class/watchdog/watchdog0/identity or
/sys/class/watchdog/watchdog0/timeout files:

$ ls -l /sys/class/watchdog/watchdog0/
total 0
-r--r--r-- 1 root root 4096 Nov 18 15:12 dev
lrwxrwxrwx 1 root root    0 Nov 18 15:12 device -> ../../../iTCO_wdt.0.auto
drwxr-xr-x 2 root root    0 Nov 18 15:12 power
lrwxrwxrwx 1 root root    0 Nov 18 14:53 subsystem -> ../../../../../../class/watchdog
-rw-r--r-- 1 root root 4096 Nov 18 14:53 uevent
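
For reference, the listing above shows only the generic device attributes.
One possible explanation (an assumption, not confirmed in this thread) is
that this kernel was built without CONFIG_WATCHDOG_SYSFS, which is the
option that exposes identity and timeout under /sys/class/watchdog. The
device symlink still names the parent device, iTCO_wdt.0.auto, which
suggests the iTCO_wdt driver. A minimal shell sketch that summarizes this
for any watchdog directory might look as follows; wdt_info is a
hypothetical helper name, not an existing tool.

```shell
# Print what sysfs reveals about a watchdog device.
# Usage: wdt_info [sysfs-dir]   (default: /sys/class/watchdog/watchdog0)
# wdt_info is a hypothetical helper name used only for illustration.
wdt_info() {
    dir="${1:-/sys/class/watchdog/watchdog0}"
    # The "device" symlink points at the parent device, e.g. iTCO_wdt.0.auto.
    if [ -L "$dir/device" ]; then
        echo "device: $(basename "$(readlink "$dir/device")")"
    fi
    # These attributes are present only when the kernel exposes them.
    for attr in identity timeout; do
        if [ -r "$dir/$attr" ]; then
            echo "$attr: $(cat "$dir/$attr")"
        else
            echo "$attr: not exposed"
        fi
    done
}
```

Called without arguments it inspects /sys/class/watchdog/watchdog0. If the
device does turn out to be iTCO_wdt, "sudo modprobe iTCO_wdt" on the distro
kernel may be enough to create the device node, though that is untested
here.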

>
> Also, you can test if the watchdog works with "sudo cat /dev/watchdog",
> assuming the watchdog daemon is not running. The watchdog works if the
> system reboots after the watchdog times out (/sys/class/watchdog/watchdog0/timeout
> is the timeout in seconds).

Running "sudo cat /dev/watchdog" rebooted my system as expected. I don't
see a timeout node, so how do I configure the timeout value?
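
Without a sysfs timeout attribute, one option is the WDIOC_SETTIMEOUT ioctl
on the device node, which the wdctl tool from util-linux wraps. The sketch
below assumes wdctl is installed and that the driver accepts the requested
value; set_wdt_timeout and the WDCTL override variable are hypothetical
names introduced for illustration. Note that opening the device can start
the watchdog, so this is best tried on a machine that may safely reboot.

```shell
# Request a new watchdog timeout through wdctl (util-linux).
# Usage: set_wdt_timeout SECONDS [DEVICE]   (default device: /dev/watchdog0)
# WDCTL is a hypothetical override, e.g. WDCTL=echo for a dry run.
set_wdt_timeout() {
    secs="$1"
    dev="${2:-/dev/watchdog0}"
    # Accept only a non-empty string of digits.
    case "$secs" in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of seconds" >&2
            return 2
            ;;
    esac
    ${WDCTL:-wdctl} --settimeout "$secs" "$dev"
}
```

For example, "set_wdt_timeout 60" run as root requests a 60 second timeout
on /dev/watchdog0. Many watchdog drivers also take a timeout as a module
parameter at load time; "modinfo iTCO_wdt" lists what that driver accepts.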
>
> >>
> >> Note though that this won't help to debug the problem. A hardware
> >> watchdog resets the system. It helps to recover, but it is not intended
> >> to help with debugging.
> > How do I use the hardware watchdog to reset my system when the system
> > is frozen? That would help me collect a crashdump and eventually
> > find the root cause of the system freeze.
> >
> There won't be a crashdump. It just hard-resets the system.

So is there any other way to capture a crashdump or trigger a soft
reboot once the kernel is locked up?
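
One approach often used for this, separate from the hardware watchdog, is
to let the kernel's own soft and hard lockup detectors panic the machine so
that kdump can capture a vmcore. The sketch below assumes kdump is already
set up (crashkernel= memory reserved and a capture kernel loaded) and that
the running kernel provides these sysctls; lockup_panic_sysctls is a
hypothetical helper name. It will not help if the hang is so severe that
the detectors themselves never fire.

```shell
# Emit sysctl settings that turn a detected lockup into a panic.
# Usage: lockup_panic_sysctls [REBOOT_DELAY_SECONDS]   (default: 10)
# lockup_panic_sysctls is a hypothetical helper name used for illustration.
lockup_panic_sysctls() {
    reboot_delay="${1:-10}"
    # Panic when the soft or hard lockup detector triggers.
    printf 'kernel.softlockup_panic = 1\n'
    printf 'kernel.hardlockup_panic = 1\n'
    # Reboot this many seconds after a panic if nothing else takes over.
    printf 'kernel.panic = %s\n' "$reboot_delay"
}
```

The output could be written to a sysctl drop-in, for example
"lockup_panic_sysctls 10 | sudo tee /etc/sysctl.d/90-lockup-panic.conf"
followed by "sudo sysctl --system"; the file name is only an example.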
>
> Guenter



-- 
Thanks,
Sekhar



