Re: [PATCH] Documentation: coding-style: don't encourage WARN*()

David Hildenbrand <david@xxxxxxxxxx> · Fri, 19 Apr 2024 09:16:41 +0200

On 14.04.24 19:08, Alex Elder wrote:
> Several times recently Greg KH has admonished that variants of WARN()
> should not be used, because when the panic_on_warn kernel option is set,
> their use can lead to a panic. His reasoning was that the majority of
> Linux instances (including Android and cloud systems) run with this option
> enabled. And therefore a condition leading to a warning will frequently
> cause an undesirable panic.
>
> The "coding-style.rst" document says not to worry about this kernel
> option.  Update it to provide a more nuanced explanation.
>
> Signed-off-by: Alex Elder <elder@xxxxxxxxxx>
> ---
>  Documentation/process/coding-style.rst | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/process/coding-style.rst b/Documentation/process/coding-style.rst
> index 9c7cf73473943..bce43b01721cb 100644
> --- a/Documentation/process/coding-style.rst
> +++ b/Documentation/process/coding-style.rst
> @@ -1235,17 +1235,18 @@ example. Again: WARN*() must not be used for a condition that is expected
>  to trigger easily, for example, by user space actions. pr_warn_once() is a
>  possible alternative, if you need to notify the user of a problem.
>  
> -Do not worry about panic_on_warn users
> -**************************************
> +The panic_on_warn kernel option
> +********************************
>  
> -A few more words about panic_on_warn: Remember that ``panic_on_warn`` is an
> -available kernel option, and that many users set this option. This is why
> -there is a "Do not WARN lightly" writeup, above. However, the existence of
> -panic_on_warn users is not a valid reason to avoid the judicious use
> -WARN*(). That is because, whoever enables panic_on_warn has explicitly
> -asked the kernel to crash if a WARN*() fires, and such users must be
> -prepared to deal with the consequences of a system that is somewhat more
> -likely to crash.
> +Note that ``panic_on_warn`` is an available kernel option. If it is enabled,
> +a WARN*() call whose condition holds leads to a kernel panic.  Many users
> +(including Android and many cloud providers) set this option, and this is
> +why there is a "Do not WARN lightly" writeup, above.
> +
> +The existence of this option is not a valid reason to avoid the judicious
> +use of warnings. There are other options: ``dev_warn*()`` and ``pr_warn*()``
> +issue warnings but do **not** cause the kernel to crash. Use these if you
> +want to prevent such panics.
>  
>  Use BUILD_BUG_ON() for compile-time assertions
>  **********************************************

Did you even read the history about that? Likely not, otherwise I wouldn't
have to learn about this patch on lwn.net.

I suggest reading:

commit 1cfd9d7e43d5a1cf739d1420b10b1e65feb02f88
Author: David Hildenbrand <david@xxxxxxxxxx>
Date:   Fri Sep 23 13:34:24 2022 +0200

    coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel")


which includes links to relevant discussions between me and Linus. Most
relevant to the discussion is [1].

What the document says right now (use WARN_ON_ONCE() *lightly*) is
precisely what I still think we should do, *1.5 years* after I
documented it.

Clear NACK from my side: "If you set 'panic_on_warn' you get to keep both
pieces when something breaks." [1]
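
For anyone following along who is less familiar with the distinction, here
is a minimal userspace sketch of the behaviour under discussion. It is only
a model: struct model_state, model_warn_on_once(), model_pr_warn_once() and
run_model() are hypothetical names made up for illustration, not kernel
symbols, and the "panic" is just a counter. It assumes the semantics
described in the patch text above: a WARN*() whose condition holds panics
when panic_on_warn is set, while pr_warn*() only logs.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only; none of these names are real kernel symbols. */
struct model_state {
	bool panic_on_warn;	/* stands in for the panic_on_warn option */
	int messages;		/* warning lines logged */
	int panics;		/* times the model would have panicked */
};

/* Models WARN_ON_ONCE(): reports once per call site and returns cond. */
static bool model_warn_on_once(struct model_state *s, bool cond,
			       bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		s->messages++;
		fprintf(stderr, "WARNING: %s\n", what);
		if (s->panic_on_warn)
			s->panics++;	/* a real kernel would panic here */
	}
	return cond;
}

/* Models pr_warn_once(): logs once and never looks at panic_on_warn. */
static void model_pr_warn_once(struct model_state *s, bool *warned,
			       const char *what)
{
	if (*warned)
		return;
	*warned = true;
	s->messages++;
	fprintf(stderr, "warning: %s\n", what);
}

/* Hits both call sites three times and returns the resulting counters. */
struct model_state run_model(bool panic_on_warn)
{
	struct model_state s = { .panic_on_warn = panic_on_warn };
	bool bug_warned = false, user_warned = false;

	for (int i = 0; i < 3; i++) {
		/* A condition that should never happen, i.e. a kernel bug. */
		model_warn_on_once(&s, true, &bug_warned, "unexpected state");
		/* A condition user space can trigger easily. */
		model_pr_warn_once(&s, &user_warned, "bad user request");
	}
	return s;
}
```

Calling run_model(false) and run_model(true) from a small main() shows the
difference: both runs log the same two messages, and only the run with the
option set records a panic, which comes from the WARN-style call site.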

[1] https://lore.kernel.org/all/CAHk-=wgF7K2gSSpy=m_=K3Nov4zaceUX9puQf1TjkTJLA2XC_g@xxxxxxxxxxxxxx/


--
Cheers,

David / dhildenb