[RT] hit recently-fixed PREEMPT_RT CFS-bandwidth timer locking issue in the wild

Hi all,

I thought people might be interested to hear that we recently hit the bug fixed by git commit c0ad4aa4d8 on multiple lab systems running the RHEL 7 "kernel-rt" kernel. (But I think other versions are at risk as well.)

Interestingly, when the bug hit, the system just hung completely. Nothing was emitted on netconsole or the serial console, neither the hung task timer nor the NMI watchdog triggered, CONFIG_DEBUG_SPINLOCK didn't output anything, and magic sysrq didn't work on the serial console. As you can imagine, this was a bit frustrating. I was finally able to cause a panic by sending an NMI from the BMC, which allowed kdump to store the core file so I could get stack traces.

Given how annoying it was to debug, I'd recommend backporting this fix as far back as it applies. HRTIMER_MODE_SOFT was introduced in mainline in 4.16, but at least in the RHEL 7 kernel-rt package (and I think in the vanilla PREEMPT_RT patches as well) hrtimers run in softirq context by default, so the fix might apply to all supported PREEMPT_RT versions.

Chris


