On Tue, 31 Oct 2023 at 15:08, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> Here are the ways forward I can see:
>
> 1. Status quo. This has all the issues that you call out.
>    People will hurt themselves with it and consume time and effort.
>    So let's not do this.

Well, at a *minimum*, I really want that notifier chain call to be
done *after* the core printk's. That way, if it deadlocks or does
something else stupid, at least the core printouts make it out.

IOW, I think the notifier should be done perhaps just before the
"panic_on_rcu_stall()" call, not at the top before you've even
reported any stall conditions at all.

And yes, I think the trace_rcu_stall_warning() might be better off
later too, but at least trace events are things that get regular
testing in nasty conditions (including NMI etc), so I'm *much* less
worried about those than about "random developers who think they know
what they do and add a notifier".

And yes, I do think the notifier should be narrowed down a lot, if you
actually want to keep it.

I did not actually hear you say that there is a good use-case for it.
I only saw you say "Those of us who need this", without showing *any*
kind of indication of why anybody would use it in reality. Why the
secrecy?

There is certainly no current user, nor any description of what a user
would be and what makes that notifier useful.

The commit message also just says "It is sometimes helpful" and some
strange reference to "the subsystem causing the stall to dump its
state". It all sounds very fishy. Why would anybody ever have a known
subsystem causing RCU stalls? Except, of course, for the rcutorture
testing.

Anyway, that all absolutely SCREAMS to me "this is not something
useful in any normal kernel", and so yes:

> 3. Add a default-n Kconfig option that depends on RCU_EXPERT
>    and DEBUG_KERNEL, so that these problems can only arise in
>    specially built kernels.
>
> 4. Same as #3, but use a kernel boot parameter instead of a
>    Kconfig option.

let's make it clear that this is *not* something that any upstream
kernel would ever do, and the *only* possible use for it is some kind
of external temporary debug patch.

See why I so hate things like this? Let's head off any crazy use long
*long* before somebody decides that "Oh, I want to use this".

             Linus
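
To illustrate the ordering argued for above (core report first, notifier
chain afterwards, just before the panic check), here is a minimal
userspace sketch. It is not kernel code: report_stall(),
call_stall_notifiers(), maybe_panic() and handle_stall() are hypothetical
stand-ins, and the event log exists only so the order can be observed.

```c
#include <stdio.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
#define MAX_EVENTS 8

static const char *events[MAX_EVENTS];
static int n_events;

/* Append one step to the event log so the call order can be checked. */
static void record(const char *what)
{
	if (n_events < MAX_EVENTS)
		events[n_events++] = what;
}

/* Stand-in for the core stall report (the printk's). */
static void report_stall(void)
{
	record("core-report");
}

/* Stand-in for the notifier chain: arbitrary code that may misbehave. */
static void call_stall_notifiers(void)
{
	record("notifier");
}

/* Stand-in for the final panic-on-stall check. */
static void maybe_panic(void)
{
	record("panic-check");
}

/* The ordering under discussion: the core output is emitted before
 * any notifier callback gets a chance to deadlock or crash. */
static void handle_stall(void)
{
	report_stall();
	call_stall_notifiers();
	maybe_panic();
}

/* Print the recorded steps, one per line. */
static void dump_events(void)
{
	for (int i = 0; i < n_events; i++)
		printf("%d: %s\n", i, events[i]);
}
```

Calling handle_stall() and then dump_events() lists core-report, notifier
and panic-check in that order. If the notifier stand-in hung instead of
returning, the core report would already have been recorded, which is the
whole point of doing it in this order.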