On Tue, 30 Jan 2018 15:35:54 +1100
Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:

> alexander.levin@xxxxxxxxxxx writes:
>
> > On Thu, Dec 14, 2017 at 12:10:39AM +1100, Michael Ellerman wrote:
> >>alexander.levin@xxxxxxxxxxx writes:
> >>
> >>> From: Nicholas Piggin <npiggin@xxxxxxxxx>
> >>>
> >>> [ Upstream commit 064996d62a33ffe10264b5af5dca92d54f60f806 ]
> >>>
> >>> The SMP hardlockup watchdog cross-checks other CPUs for lockups, which
> >>> causes xmon headaches because it's assuming interrupts hard disabled
> >>> means no watchdog troubles. Try to improve that by calling
> >>> touch_nmi_watchdog() in obvious places where secondaries are spinning.
> >>>
> >>> Also annotate these spin loops with spin_begin/end calls.
> >>
> >>These macros didn't exist until 4.13, and haven't been backported AFAIK.
> >
> > But the touch_nmi_watchdog() bits are something we want in stable, right?
>
> I don't think you need them unless you've also back ported
> arch/powerpc/kernel/watchdog.c, which I don't think you have.
>
> Maybe Nick can confirm?

I'm not 100% sure. The CPUs only check themselves for lockups. They will
blow their threshold when in xmon, but when they come out of xmon, I think
by a quirk of our local_irq_enable() implementation that actually checks
timers explicitly and runs them first before re-enabling hard interrupts,
then our heartbeat starts up again just before the perf interrupt would
come in to report the lockup.

I think.

Given that we've had no reports of misbehaviour of the old perf watchdog,
I would say you can skip the backport.

Thanks,
Nick