Sorry for the delayed response...I've got some difficult family things to
work on IRL that are taking priority...

On 11/12/2015 05:23 PM, Timur Tabi wrote:
> On 11/12/2015 06:06 PM, Al Stone wrote:
>> If it is a NAK, that's fine, but I also want to be sure I understand what the
>> objections are.  Based on my understanding of the discussion so far over the
>> multiple versions, I think the primary objection is that the use of pretimeout
>> makes this driver too complex, and indeed complex enough that there is some
>> concern that it could destabilize a running system.  Do I have that right?
>
> I don't have a problem with the concept of pre-timeout per se.  My primary
> objection is this code:
>
>> +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>> +{
>> +	struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
>> +	struct watchdog_device *wdd = &gwdt->wdd;
>> +
>> +	/* We don't use pretimeout, trigger WS1 now */
>> +	if (!wdd->pretimeout)
>> +		sbsa_gwdt_set_wcv(wdd, 0);
>
> This driver depends on an interrupt handler in order to properly program the
> hardware.  Unlike some other devices, the SBSA watchdog does not need assistance
> to reset on a timeout -- it is a "fire and forget" device.  What happens if
> there is a hard lockup, and interrupts no longer work?

Aha.  I see now.  That helps clarify a lot.  Thanks.

> The reason why Fu does this is because he wants to support a pre-timeout value
> that's independent of the timeout value.  The SBSA watchdog is normally
> programmed where real timeout equals twice the pre-timeout.  I would prefer that
> the driver adhere to this limitation.  That would eliminate the need to
> pre-program the hardware in the interrupt handler.

The "normally programmed" limitation described is interesting; forgive my
ignorance, but where is that specified?  I couldn't find anything that specific
in the SBSA, or the ARM ARM, but I could have missed it.

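As a rough illustration of the "real timeout equals twice the pre-timeout"
arrangement quoted above, here is a minimal sketch in plain C.  It is not taken
from the posted patch or from any specification text; it assumes a single
offset value that the hardware counts down twice, first to the pre-timeout
signal (called WS0 here) and then to the reset signal (WS1).  The names
wdt_stages, pretimeout_to_offset and stages_from_offset are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not code from the posted driver.  Assumption: one
 * offset value is loaded into the watchdog and counted down twice, first
 * to the stage-one signal (WS0, pre-timeout) and then to the stage-two
 * signal (WS1, reset).
 */
struct wdt_stages {
	uint64_t ws0_ticks;	/* ticks from the last refresh until WS0 */
	uint64_t ws1_ticks;	/* ticks from the last refresh until WS1 */
};

/* Convert a pre-timeout in seconds into the single offset value, in ticks. */
static uint64_t pretimeout_to_offset(uint32_t pretimeout_s, uint32_t clock_hz)
{
	assert(clock_hz > 0);	/* a stopped clock would never time out */
	return (uint64_t)pretimeout_s * clock_hz;
}

/* With one shared offset, the reset always lands at twice the pre-timeout. */
static struct wdt_stages stages_from_offset(uint64_t offset)
{
	struct wdt_stages s = { .ws0_ticks = offset, .ws1_ticks = 2 * offset };

	return s;
}
```

With a 10 second pre-timeout and a 100 Hz clock (both made-up numbers),
pretimeout_to_offset() gives 1000 ticks, so the first stage fires 1000 ticks
after the last refresh and the second at 2000.  Under this assumption the two
intervals cannot be chosen independently, and nothing has to be reprogrammed
from an interrupt handler between the stages.
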
That being said, keeping them independent at least seems like a good idea; if
I think about kdump/kexec or some other recovery mechanism wanting to perhaps
copy part of RAM or flush a filesystem/database, or maybe do some other magic
to recover enough to be able to reset the timer, that may be a really long
interval on a large server.  I could easily see that being very different from
a watchdog timer that's meant to just make sure the platform is still making
progress.  Conversely, I could see that recovery interval being very small or
zero on a guest OS, for example, and the watchdog still different.

>> And finally, a simpler, single stage timeout watchdog driver would be a
>> reasonable thing to accept, yes?  I can see where that would make sense.
>
> I would be okay with merging such a driver, and then enhancing it later to add
> pre-timeout support.
>
>> The issue for me in that case is that the SBSA requires a two stage timeout,
>> so a single stage driver has no real value for me.
>
> There are plenty of existing watchdog devices that have a two-stage timeout but
> the driver treats it as a single stage.  The PowerPC watchdog driver is like
> that.  The hardware is programmed for the second stage to cause a hardware
> reset, and the interrupt handler is typically a no-op or just a printk().
>

Hrm.  Thanks for the pointer.  I _think_ I see a way to do that with arm64,
and perhaps combine this driver's functionality with what Timur did originally,
but still have it reasonably straightforward.  I need to do the experiments,
though, and see if it actually works first.

-- 
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Linaro Enterprise Group
al.stone@xxxxxxxxxx
-----------------------------------