On 29.08.22 05:07, Linus Torvalds wrote:
> On Sun, Aug 28, 2022 at 6:56 PM Dave Young <dyoung@xxxxxxxxxx> wrote:
>>
>>> John mentioned PANIC_ON().
>>
>> I would vote for PANIC_ON(), it sounds like a good idea, because
>> BUG_ON() is not obvious and, PANIC_ON() can alert the code author that
>> this will cause a kernel panic and one will be more careful before
>> using it.
>
> People, NO.
>
> We're trying to get rid of BUG_ON() because it kills the machine.
>
> Not replace it with another bogus thing that kills a machine.
>
> So no PANIC_ON(). We used to have "panic()" many many years ago, we
> got rid of it. We're not re-introducing it.
>
> People who want to panic on warnings can do so. WARN_ON() _becomes_
> PANIC for those people. But those people are the "we have a million
> machines, we want to just fail things on any sign of trouble, and we
> have MIS people who can look at the logs".
>
> And it's not like we need to get rid of _all_ BUG_ON() cases. If you
> have a "this is major internal corruption, there's no way we can
> continue", then BUG_ON() is appropriate. It will try to kill that
> process and try to keep the machine running, and again, the kind of
> people who don't care about one machine (because - again - they have
> millions of them) can just turn that into a panic-and-reboot
> situation.
>
> But the kind of people for whom the machine they are on IS THEIR ONLY
> MACHINE - whether it be a workstation, a laptop, or a cellphone -
> there is absolutely zero situation where "let's just kill the machine"
> is *EVER* approproate. Even a BUG_ON() will try to continue as well as
> it can after killing the current thread, but it's going to be iffy,
> because locking etc.
>
> So WARN_ON_ONCE() is the thing to aim for. BUG_ON() is the thing for
> "oops, I really don't know what to do, and I physically *cannot*
> continue" (and that is *not* "I'm too lazy to do error handling").
>
> There is no room for PANIC. None. Ever.

Let me clarify what I had in mind, avoiding the PANIC_ON terminology
John raised. I was wondering if it would make sense to

1) Be able to specify a severity for WARN (developer decision)
2) Be able to specify a severity for panic_on_warn (admin decision)

Distributions with kdump could keep a mode whereby severe warnings
(e.g., former BUG_ON) would properly kdump+reboot and harmless warnings
(e.g., clean recovery path) would WARN once + continue.

I agree that whether to panic should in most cases be a decision of the
admin, not the developer.

Now, that's a different discussion than the documentation update at
hand, and I primarily wanted to raise awareness among the kdump people
and ask them whether a stronger move towards WARN_ON_ONCE will affect
them/customer expectations.

I'll work with John to document the current rules to reflect everything
you said here.

-- 
Thanks,

David / dhildenb