On Tue, Oct 31, 2023 at 06:07:57PM -0700, Paul E. McKenney wrote:
> On Tue, Oct 31, 2023 at 01:06:44PM -1000, Linus Torvalds wrote:

[ . . . ]

> > I really think that we should *never* have any kind of notifiers for
> > kernel bugs. They cause problems. The *one* exception is an actual
> > honest-to-goodness kernel debugger, and then it should literally
> > *only* be the debugger that can register a notifier, so that you are
> > *never* in the situation that a kernel without a debugger will just
> > hang because of some bogus debug notifier.

Here you might have been suggesting that I use gdb and just set a
breakpoint in check_cpu_stall(), and then use gdb commands to read out
the state.  And yes, this works well in some situations.  In fact, there
is a --gdb parameter to the rcutorture scripting for just this purpose.

Except that I normally run a few hundred rcutorture guest OSes spread
across 20 systems, and sometimes more than a thousand guest OSes across
50 systems for hard-to-reproduce bugs.  In my experience, managing that
many remote gdb sessions is cranky and unreliable, which is not helpful
when debugging.  Writing a few tens of lines of C code in the kernel is
much simpler and more reliable.

Assuming of course that I avoid the traps you point out.  Which I have
done thus far.  (Famous last words...)

							Thanx, Paul