On 8/24/22 09:30, David Hildenbrand wrote:
...
> So one idea would be to have some kind of "panic_on_warn_with_kdump" mode.
> But then, we'd actually crash+kdump even on the most harmless WARN_ON()
> conditions, because they all look alike. To compensate, we would need
> some kind of "severity" levels of a warning -- at least some kind of
> "this is harmless and we can easily recover, but please tell the
> developers" vs. "this is real bad and unexpected, capture a dump
> immediately instead of trying to recover and eventually failing miserably".
>
> But then, maybe we really want something like BUG_ON() -- let's call it
> CBUG_ON() for simplicity -- but be able to make it be usable in
> conditionals (to implement recovery code if easily possible) and make the
> runtime behavior configurable.
>
>     if (CBUG_ON(whatever))
>         try_to_recover()
>
> Whereby, for example, "panic_on_cbug" and "panic_on_cbug_with_kdump"
> could control the runtime behavior.
>
> But this is just a braindump and I assume people reading along have other,
> better ideas. Especially, a better name for CBUG.
>

If this direction is pursued (as opposed to just recommending the
panic_on_warn approach, which is probably viable as well, btw), then
I'd suggest this name:

    PANIC_ON()

It's different than BUG_ON(), because it calls panic() instead of
immediately halting on an undefined instruction exception (yes, that's
x86-centric, I know). So at least in the better-behaved cases, there is
a backtrace and a reboot, rather than a mysterious hard lockup. As Mel
points out [1], it's not always that much better. But in my experience,
this is usually a *lot* better.

It's only intended for a few very special cases. It is not intended as
any sort of assert (which BUG was sometimes used for).

This forces a panic(), which is what David is looking for.

[1] https://lore.kernel.org/all/20220816094056.x4ldzednboaln3ag@xxxxxxx/

thanks,
-- 
John Hubbard
NVIDIA
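
For illustration only, here is a rough userspace sketch of the quoted
"usable in conditionals, runtime-configurable" idea. Nothing here is an
existing kernel interface: CBUG_ON() is just the placeholder name from the
quoted mail, and the panic_on_cbug flag, cbug_report(), fake_panic() and
checked_div() are hypothetical names made up for this sketch. abort()
stands in for the kernel's panic().

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical runtime knob, standing in for something like a
 * "panic_on_cbug" boot parameter or sysctl. Assumption, not a real API. */
static bool panic_on_cbug;

/* Userspace stand-in for panic(): report the failure and terminate. */
static void fake_panic(const char *expr, const char *file, int line)
{
    fprintf(stderr, "panic: CBUG_ON(%s) at %s:%d\n", expr, file, line);
    abort();
}

/* Returns true when the condition fired, so the caller can recover.
 * If the knob is set, this never returns for a true condition. */
static bool cbug_report(bool cond, const char *expr,
                        const char *file, int line)
{
    if (!cond)
        return false;
    if (panic_on_cbug)
        fake_panic(expr, file, line);
    fprintf(stderr, "CBUG_ON(%s) hit at %s:%d, trying to recover\n",
            expr, file, line);
    return true;
}

/* Evaluates to the truth value of cond, so it can sit inside an if (). */
#define CBUG_ON(cond) cbug_report(!!(cond), #cond, __FILE__, __LINE__)

/* Example caller: recover from a "should never happen" input by
 * returning an error and leaving *out untouched. */
static int checked_div(int a, int b, int *out)
{
    if (CBUG_ON(b == 0))
        return -1;
    *out = a / b;
    return 0;
}
```

With panic_on_cbug left false, checked_div(1, 0, &out) logs the condition
and returns -1; with it set to true, the same call ends in abort(). An
unconditional variant along the lines of PANIC_ON() would simply skip the
knob and always take the fake_panic() path.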