Hi Guenter,

On 03/05/2016:06:29:39 AM, Guenter Roeck wrote:
> On 05/03/2016 01:20 AM, Pratyush Anand wrote:
> >Currently only WOR is used to program both first and second stage which
> >provided very limited range of timeout.
> >
> >This patch uses WCV as well to achieve higher range of timeout. This patch
> >programs max_timeout as 255, but that can be increased further as well.
> >
> >Following testing shows that we can happily achieve 40 second default timeout.
> >
> > # modprobe sbsa_gwdt action=1
> > [ 131.187562] sbsa-gwdt sbsa-gwdt.0: Initialized with 40s timeout @ 250000000 Hz, action=1.
> > # cd /sys/class/watchdog/watchdog0/
> > # cat state
> > inactive
> > # cat /dev/watchdog0
> > cat: /dev/watchdog0: Invalid argument
> > [ 161.710593] watchdog: watchdog0: watchdog did not stop!
> > # cat state
> > active
> > # cat timeout
> > 40
> > # cat timeleft
> > 38
> > # cat timeleft
> > 25
> > # cat /dev/watchdog0
> > cat: /dev/watchdog0: Invalid argument
> > [ 184.931030] watchdog: watchdog0: watchdog did not stop!
> > # cat timeleft
> > 37
> > # cat timeleft
> > 21
> > ...
> > ...
> > # cat timeleft
> > 1
> >
> >panic() is called upon timeout of 40s. See timestamp of last kick (cat) and
> >next panic() message.
> >
> > [ 224.939065] Kernel panic - not syncing: SBSA Watchdog timeout
> >
> >Signed-off-by: Pratyush Anand <panand@xxxxxxxxxx>
>
> You could also use the new infrastructure (specify max_hw_heartbeat_ms instead
> of max_timeout), and not depend on the correct implementation of WCV.

Thanks for pointing me to max_hw_heartbeat_ms. I have just gone through it.
It would certainly be helpful, and part of this patch will go away. In fact,
once max_hw_heartbeat_ms is supported, there should be no functional change
for action=0.

However, we would still need some changes for action=1. When action=1, the
ISR is called, which calls panic(). Calling panic() will in turn trigger the
dump-saving mechanism, which can cause a secondary kernel to be executed.
Now, it might happen that, with the limited timeout (max_hw_heartbeat_ms)
programmed in the first kernel, we hit a reset before the secondary kernel
can start kicking the watchdog again or complete the dump save.

So, in my opinion:
(1) We should use max_hw_heartbeat_ms.
(2) Then we should overwrite WCV in the ISR, so that the hardware reset
    happens only after the user-programmed "timeout" value has elapsed.

~Pratyush
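
To put rough numbers on points (1) and (2), here is a minimal sketch of the
arithmetic only. It assumes WOR is a 32-bit offset and WCV a 64-bit compare
value counted at the watchdog clock; both helper names are hypothetical and
are not taken from the driver, and nothing here touches hardware.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the driver): longest whole-second period
 * that fits in a 32-bit WOR at the given counter frequency.
 */
static inline uint32_t wor_max_timeout_s(uint32_t clk_hz)
{
	return UINT32_MAX / clk_hz;
}

/*
 * Hypothetical helper: 64-bit compare value that expires timeout_s seconds
 * after the counter reading "now". This models what the ISR would write to
 * WCV in step (2); it is an assumption about the approach, not driver code.
 */
static inline uint64_t wcv_after(uint64_t now, uint32_t clk_hz,
				 uint32_t timeout_s)
{
	return now + (uint64_t)clk_hz * timeout_s;
}
```

At the 250000000 Hz clock from the log above, wor_max_timeout_s() gives 17,
while a 40 s deadline needs 10000000000 ticks, which does not fit in 32 bits.
That gap is why the second stage would need WCV written directly in the ISR.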