Hello Guenter,

On Tue, Aug 04, 2015 at 09:03:27AM -0700, Guenter Roeck wrote:
> On 08/04/2015 08:52 AM, Uwe Kleine-König wrote:
> >On Tue, Aug 04, 2015 at 08:31:43AM -0700, Guenter Roeck wrote:
> >>On 08/04/2015 05:18 AM, Uwe Kleine-König wrote:
> >>>On Mon, Aug 03, 2015 at 07:13:28PM -0700, Guenter Roeck wrote:
> >>>>structure. If the configured timeout exceeds half the value of the
> >>>>maximum hardware timeout, the watchdog core enables a timer function
> >>>>to assist sending keepalive requests to the watchdog driver.
> >>>I don't understand why you want to halve the maximum hw-timeout. If my
> >>>watchdog has hw-max-timeout = 5s and userspace sets it to 3s there
> >>>should be no need for assistance?! I think the implementation is the
> >>>other way round?
> >>>
> >>It is supposed to reflect the _maximum_ timeout. That is different to
> >>the time between heartbeats, which is supposed to be less; using half
> >>the value of the maximum hardware timeout seemed to be a safe number.
> >Right, I got that. With hw-max-timeout = 5s the machine resets after 5s
> >not caring for the device. And so pinging repeatedly after 2.5s is fine.
> >But if userspace sets a timeout of 3s (probably with the intention to
> >ping with a frequency of 1/1.5s) there is no need for worker-assistance,
> >because the pings coming in each 1.5s provided by userspace are good
> >enough.
> >
> Yes, that is how it is supposed to work.

So for the changelog you want:

	If the configured timeout exceeds the maximum hardware timeout
	the watchdog core enables a timer function ...

right?

> >>>>+static inline bool watchdog_need_worker(struct watchdog_device *wdd)
> >>>>+{
> >>>>+	unsigned int hm = wdd->max_hw_timeout_ms;
> >>>>+	unsigned int m = wdd->max_timeout * 1000;
> >>>>+
> >>>>+	return watchdog_active(wdd) && hm && hm != m &&
> >>>>+		wdd->timeout * 500 > hm;

One problem with the worker I see is that the reset will probably be
delayed with your worker.
Consider userspace sets timeout = 10 s because if the main application
doesn't work for 12 s something dangerous can happen. (Consider a
guillotine where the blade can only be held up for 12 s when not locked.
:-) Now if the hw-max-timeout is 9 s you set up a timer to ping at
$last_keepalive + 4.5 s and $last_keepalive + 9 s (not taking timer and
system latency into account). That means the system only resets 18 s
after the last userspace ping. Oops.

So ideally you send the last auto-ping at

	$last_keepalive + $configured_timeout - $hw-max-timeout

(assuming the hardware is configured for $hw-max-timeout).

> >>>I don't understand what max_timeout is now that there is max_hw_timeout.
> >>>So I don't understand why you need hm != m either.
> >>>
> >>
> >>Backward compatibility. A driver which does not set max_hw_timeout_ms,
> >>or sets both to the same value, by definition expects to handle everything
> >>internally, and thus no worker is configured.
> >And a driver that does
> >
> >	max_timeout = 5
> >	max_hw_timeout = 5125
> >
> >falls through the cracks.
> >
> Hmm - not that this configuration makes any sense, but you are right.
> I'll make it "hm < m".

It does not? What do you expect max_timeout to be set to if the maximal
hw-timeout is 5125 ms? 0 would work, but IMHO you need some more
documentation then.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
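
For illustration, the arithmetic of the guillotine example can be
sketched as below. This is only a rough model, not code from the patch:
the function names are hypothetical, all times are in milliseconds
relative to the last userspace keepalive, the hardware is assumed to be
programmed to hw-max-timeout, and timer and system latency are ignored.

```c
/*
 * Hypothetical helpers for illustration only; they are not part of the
 * watchdog core. Times are in ms since the last userspace keepalive.
 */

/* Reset time if the worker pings every hw_max/2 until the timeout expires. */
static unsigned int reset_with_periodic_pings(unsigned int timeout_ms,
					      unsigned int hw_max_ms)
{
	unsigned int interval = hw_max_ms / 2;
	unsigned int last_ping = 0;	/* the userspace keepalive itself */
	unsigned int t;

	if (!interval)			/* avoid looping forever on tiny values */
		return hw_max_ms;

	/* Every worker ping that still falls inside the configured timeout. */
	for (t = interval; t <= timeout_ms; t += interval)
		last_ping = t;

	/* The hardware fires hw_max after the last ping it received. */
	return last_ping + hw_max_ms;
}

/* Reset time if the final worker ping is sent at timeout - hw_max. */
static unsigned int reset_with_final_ping(unsigned int timeout_ms,
					  unsigned int hw_max_ms)
{
	unsigned int last_ping = 0;

	/* Earlier pings may be needed in between; only the last one matters. */
	if (timeout_ms > hw_max_ms)
		last_ping = timeout_ms - hw_max_ms;

	return last_ping + hw_max_ms;
}
```

With timeout = 10000 and hw-max-timeout = 9000 the first function gives
18000 and the second gives 10000, matching the numbers above.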