Re: watchdog: how to enable?

On 11/16/19 10:34 AM, Muni Sekhar wrote:
On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On 11/15/19 7:03 PM, Muni Sekhar wrote:
[ ... ]

Another possibility, of course, might be to enable a hardware watchdog
in your system (assuming it supports one). I personally would not trust
the NMI watchdog to detect a system hang because, after all, there are
situations where even NMIs no longer work.

From dmesg, is it possible to know whether my system supports a
hardware watchdog or not?
Assuming that my system supports a hardware watchdog, how do I
enable it to debug the system freeze issues?


Hardware watchdog support really depends on the board type. Most PC
mainboards support a watchdog in the Super-IO chip, but on some it is
not wired correctly. On embedded boards it is often built into the SoC.
The easiest way to see if you have a watchdog would be to check for the
existence of /dev/watchdog. However, on a PC that would most likely
not be there because the necessary module is not auto-loaded.
If you tell us your board type, or better the Super-IO chip on the board,
we might be able to help.
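
The device-node check described above can be sketched as a small shell function. This is only an illustration: the function name list_watchdogs and the optional directory argument are made up here, and finding no node says nothing about whether the hardware exists, since the driver module may simply not be loaded.

```shell
#!/bin/sh
# List watchdog device nodes under a /dev-style directory (default: /dev).
# Prints one path per line; returns non-zero if none are found.
# NOTE: list_watchdogs is a hypothetical helper name, not an existing tool.
list_watchdogs() {
    dev_dir="${1:-/dev}"
    status=1
    for node in "$dev_dir"/watchdog*; do
        # An unmatched glob stays literal, so check that the node really exists.
        [ -e "$node" ] || continue
        echo "$node"
        status=0
    done
    return "$status"
}

if ! list_watchdogs; then
    echo "no watchdog device node found; a driver module may need to be loaded" >&2
fi
```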

I have two identically configured systems. On one I installed the
vanilla kernel, and I see the /dev/watchdog and /dev/watchdog0
nodes. On the other I am running the Ubuntu distribution kernel,
but I don't see any watchdog device node. So it looks like I need to
manually load the kernel module in the distro kernel. Is there a way to
find out which kernel module corresponds to the /dev/watchdog node?

# ls -l /dev/watchdog*
crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0

# ps -ax | grep watchdog
   678 ?        S      0:00 [watchdogd]

Regarding the Super-IO chip, how do I find out its model?

You could try to run sensors-detect (from the "lm-sensors" package,
which also provides the "sensors" command).

If you can boot a system with /dev/watchdog0, you should see the type
in /sys/class/watchdog/watchdog0/identity.
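
As a rough sketch, the identity can be read for every registered watchdog in one go. The function name describe_watchdogs is hypothetical. The "device/driver" lookup is an assumption that goes beyond what is said above: it only works when the watchdog has a parent device with a bound driver, which is not guaranteed. Where it does work, the driver name is usually a good hint for which module to load on the system that lacks the device node.

```shell
#!/bin/sh
# Print identity and bound driver for every watchdog registered in sysfs.
# The class directory is a parameter only so the function can be tried on a fake tree.
# NOTE: describe_watchdogs is a hypothetical helper name.
describe_watchdogs() {
    class_dir="${1:-/sys/class/watchdog}"
    for wd in "$class_dir"/watchdog*; do
        [ -d "$wd" ] || continue
        identity="unknown"
        if [ -r "$wd/identity" ]; then
            identity=$(cat "$wd/identity")
        fi
        driver="unknown"
        # Assumption: the parent device exposes a "driver" symlink.
        if [ -e "$wd/device/driver" ]; then
            driver=$(basename "$(readlink -f "$wd/device/driver")")
        fi
        echo "$(basename "$wd"): identity=$identity driver=$driver"
    done
}

describe_watchdogs
```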

Also, you can test if the watchdog works with "sudo cat /dev/watchdog",
assuming the watchdog daemon is not running. The watchdog works if the
system reboots after the watchdog times out (/sys/class/watchdog/watchdog0/timeout
is the timeout in seconds).
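
A minimal sketch of that test follows. It only reads the timeout; the command that actually arms the watchdog is left as a comment because it hard-resets the machine. The function name watchdog_timeout is hypothetical.

```shell
#!/bin/sh
# Read the configured timeout (in seconds) of a watchdog from its sysfs directory.
# NOTE: watchdog_timeout is a hypothetical helper name.
watchdog_timeout() {
    wd_dir="${1:-/sys/class/watchdog/watchdog0}"
    cat "$wd_dir/timeout"
}

# Deliberately not executed here, because a working watchdog resets the system:
#   sudo cat /dev/watchdog
# Make sure no watchdog daemon is running before trying it.
if [ -r /sys/class/watchdog/watchdog0/timeout ]; then
    echo "a working watchdog should reset the system about $(watchdog_timeout) seconds after it is opened"
fi
```

If the system is still up well after the timeout has passed, the watchdog is probably not working on that board.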


Note though that this won't help to debug the problem. A hardware
watchdog resets the system. It helps to recover, but it is not intended
to help with debugging.

How do I use the hardware watchdog to reset my system when the system is
frozen? That would help me collect a crashdump and ultimately find the
root cause of the system freeze.

There won't be a crashdump. It just hard-resets the system.

Guenter