On Sun, Aug 28, 2022 at 6:56 PM Dave Young <dyoung@xxxxxxxxxx> wrote:
>
> > John mentioned PANIC_ON().
>
> I would vote for PANIC_ON(), it sounds like a good idea, because
> BUG_ON() is not obvious and, PANIC_ON() can alert the code author that
> this will cause a kernel panic and one will be more careful before
> using it.

People, NO.

We're trying to get rid of BUG_ON() because it kills the machine. Not
replace it with another bogus thing that kills a machine.

So no PANIC_ON(). We used to have "panic()" many many years ago, we
got rid of it. We're not re-introducing it.

People who want to panic on warnings can do so. WARN_ON() _becomes_
PANIC for those people. But those people are the "we have a million
machines, we want to just fail things on any sign of trouble, and we
have MIS people who can look at the logs".

And it's not like we need to get rid of _all_ BUG_ON() cases. If you
have a "this is major internal corruption, there's no way we can
continue", then BUG_ON() is appropriate. It will try to kill that
process and try to keep the machine running, and again, the kind of
people who don't care about one machine (because - again - they have
millions of them) can just turn that into a panic-and-reboot
situation.

But the kind of people for whom the machine they are on IS THEIR ONLY
MACHINE - whether it be a workstation, a laptop, or a cellphone -
there is absolutely zero situation where "let's just kill the machine"
is *EVER* appropriate. Even a BUG_ON() will try to continue as well as
it can after killing the current thread, but it's going to be iffy,
because locking etc.

So WARN_ON_ONCE() is the thing to aim for. BUG_ON() is the thing for
"oops, I really don't know what to do, and I physically *cannot*
continue" (and that is *not* "I'm too lazy to do error handling").

There is no room for PANIC. None. Ever. The only thing there is are "I
don't care about this machine because I've got 999,999 other machines,
so I'd rather take one machine offline for analysis".

Understand? The "should I panic and reboot" is fundamentally not about
the code, and it's not a choice that the kernel code gets to make.
It's purely about the choice of the person maintaining the machine.

As a kernel developer, you do not EVER get to say "panic" or "kill the
machine". End of story.

               Linus