Hi, On Wed, May 24, 2023 at 6:59 AM Petr Mladek <pmladek@xxxxxxxx> wrote: > > On Fri 2023-05-19 10:18:37, Douglas Anderson wrote: > > The fact that there watchdog_hardlockup_enable(), > > watchdog_hardlockup_disable(), and watchdog_hardlockup_probe() are > > declared __weak means that the configured hardlockup detector can > > define non-weak versions of those functions if it needs to. Instead of > > doing this, the perf hardlockup detector hooked itself into the > > default __weak implementation, which was a bit awkward. Clean this up. > > > > >From comments, it looks as if the original design was done because the > > __weak function were expected to implemented by the architecture and > > not by the configured hardlockup detector. This got awkward when we > > tried to add the buddy lockup detector which was not arch-specific but > > wanted to hook into those same functions. > > > > This is not expected to have any functional impact. > > > > @@ -187,27 +187,33 @@ static inline void watchdog_hardlockup_kick(void) { } > > #endif /* !CONFIG_HARDLOCKUP_DETECTOR_PERF */ > > > > /* > > - * These functions can be overridden if an architecture implements its > > - * own hardlockup detector. > > + * These functions can be overridden based on the configured hardlockdup detector. > > * > > * watchdog_hardlockup_enable/disable can be implemented to start and stop when > > - * softlockup watchdog start and stop. The arch must select the > > + * softlockup watchdog start and stop. The detector must select the > > * SOFTLOCKUP_DETECTOR Kconfig. > > */ > > -void __weak watchdog_hardlockup_enable(unsigned int cpu) > > -{ > > - hardlockup_detector_perf_enable(); > > -} > > +void __weak watchdog_hardlockup_enable(unsigned int cpu) { } > > > > -void __weak watchdog_hardlockup_disable(unsigned int cpu) > > -{ > > - hardlockup_detector_perf_disable(); > > -} > > +void __weak watchdog_hardlockup_disable(unsigned int cpu) { } > > > > /* Return 0, if a hardlockup watchdog is available. Error code otherwise */ > > int __weak __init watchdog_hardlockup_probe(void) > > { > > - return hardlockup_detector_perf_init(); > > + /* > > + * If CONFIG_HAVE_NMI_WATCHDOG is defined then an architecture > > + * is assumed to have the hard watchdog available and we return 0. > > + */ > > + if (IS_ENABLED(CONFIG_HAVE_NMI_WATCHDOG)) > > + return 0; > > + > > + /* > > + * Hardlockup detectors other than those using CONFIG_HAVE_NMI_WATCHDOG > > + * are required to implement a non-weak version of this probe function > > + * to tell whether they are available. If they don't override then > > + * we'll return -ENODEV. > > + */ > > + return -ENODEV; > > } > > When thinking more about it. It is weird that we need to handle > CONFIG_HAVE_NMI_WATCHDOG in this default week function. > > It should be handled in watchdog_hardlockup_probe() implemented > in kernel/watchdog_perf.c. > > IMHO, the default __weak function could always return -ENODEV; > > Would it make sense, please? I don't quite understand. I'd agree that the special case for CONFIG_HAVE_NMI_WATCHDOG is ugly, but it was also ugly before. IMO it's actually a little less ugly / easier to understand after my patch. ...but let me walk through how I think this special case works and maybe you can tell me where I'm confused. The first thing to understand is that CONFIG_HARDLOCKUP_DETECTOR_PERF and CONFIG_HAVE_NMI_WATCHDOG are mutually exclusive from each other. This was true before any of my patches and is still true after them. Specifically, if CONFIG_HAVE_NMI_WATCHDOG is defined then an architecture implements arch_touch_nmi_watchdog() (as documented in the Kconfig docs for HAVE_NMI_WATCHDOG). Looking at the tree before my series you can see that the perf hardlockup detector also implemented arch_touch_nmi_watchdog(). This would have caused a conflict. The mutual exclusion was presumably enforced by an architecture not defining both HAVE_NMI_WATCHDOG and HAVE_HARDLOCKUP_DETECTOR_PERF. The second thing to understand is that an architecture that defines CONFIG_HAVE_NMI_WATCHDOG is _not_ required to implement watchdog_hardlockup_probe() (used to be called watchdog_nmi_probe()). Maybe this should change, but at the very least it appears that SPARC64 defines HAVE_NMI_WATCHDOG but doesn't define watchdog_hardlockup_probe() AKA watchdog_nmi_probe(). Anyone who defines CONFIG_HAVE_NMI_WATCHDOG and doesn't implement watchdog_hardlockup_probe() is claiming that their watchdog needs no probing and is always available. So with that context: 1. We can't handle any special cases for CONFIG_HAVE_NMI_WATCHDOG in "kernel/watchdog_perf.c". The special cases that we need to handle are all for the cases where CONFIG_HARDLOCKUP_DETECTOR_PERF isn't defined and that means "kernel/watchdog_perf.c" isn't included. 2. We can't have the default __weak function return -ENODEV because CONFIG_HAVE_NMI_WATCHDOG doesn't require an arch to implement watchdog_hardlockup_probe(), but we want watchdog_hardlockup_probe() to return "no error" in that case so that "watchdog_hardlockup_available" gets set to true. Does that sound right? I'd agree that a future improvement saying that CONFIG_HAVE_NMI_WATCHDOG means you _must_ implement watchdog_hardlockup_probe() would make sense and that would allow us to get rid of the special case. IMO, though, that's a separate patch. I'd be happy to review that patch if you wanted to post it up. :-) If we want to add that requirement, I _think_ the only thing you'd need to do is to add watchdog_hardlockup_probe() to sparc64 and have it return 0 and put that definition in the same file containing arch_touch_nmi_watchdog(). powerpc also gets CONFIG_HAVE_NMI_WATCHDOG as a side effect of selecting CONFIG_HAVE_HARDLOCKUP_DETECTOR_ARCH but it looks like they implement watchdog_hardlockup_probe() already. Oh, but maybe this will fix a preexisting (existed before my patches) minor bug... Unless I'm missing something (entirely possible!) on powerpc today I guess if you turn off CONFIG_PPC_WATCHDOG then CONFIG_HAVE_HARDLOCKUP_DETECTOR_ARCH and CONFIG_HAVE_NMI_WATCHDOG would still be defined and we'd end up returning 0 (no error) from watchdog_hardlockup_probe(). That means that on powerpc today if you turn off CONFIG_PPC_WATCHDOG that '/proc/sys/kernel/nmi_watchdog' will still think the watchdog is enabled? -Doug