Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Thorsten Leemhuis <linux@xxxxxxxxxxxxx> · Fri, 21 Dec 2018 16:26:31 +0100

Hi! Am 17.12.18 um 19:24 schrieb Jonathan Corbet:
> On Mon, 17 Dec 2018 16:20:42 +0100
> Thorsten Leemhuis <linux@xxxxxxxxxxxxx> wrote:
>
>> +might be relevant later when investigating problems. Don't worry
>> +yourself too much about this, most of the time it's not a problem to run
> s/yourself//

Thx for this and other suggestions or fixes, consider them implemented when
not mentioned in this mail. Find the current state of the text at the end of
this mail for reference.

> [...]
>> +At runtime, you can query the tainted state by reading
>> +``/proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
>> +tainted; any other number indicates the reasons why it is. You might
>> +find that number in below table if there was only one reason that got
>> +the kernel tainted. If there were multiple reasons you need to decode
>> +the number, as it is a bitfield, where each bit indicates the absence or
>> +presence of a particular type of taint. You can use the following python
>> +command to decode::
> Here's an idea if you feel like improving this: rather than putting an
> inscrutable program inline, add a taint_status script to scripts/ that
> prints out the status in fully human-readable form, with the explanation
> for every set bit.

I posted the script earlier today and noticed now that it prints only
the fully human-readable form, not if a bit it set or unset. Would you
prefer if it did that as well?

>> +===  ===  ======  ========================================================
>> +Bit  Log     Int  Reason that got the kernel tainted
>> +===  ===  ======  ========================================================
>> + 1)  G/P       0  proprietary module got loaded
> I'd s/got/was/ throughout.  Also, this is the kernel, we start counting at
> zero! :)

Hehe, yeah :-D At first I actually started at zero, but that looked
odd as the old explanations (those already in the file) start to could at one.
Having a off-by-one within one document is just confusing, that's why I
decided against starting at zero here.

Another reason that came to my mind when reading your comment: Yes, this
is the kernel, but the document should be easy to understand even for
inexperienced users (e.g. people that know how to open and use command
line tools, but never learned programming). That's why I leaning towards
starting with one everywhere. But yes, that can be confusing, that's
why I added a note, albeit I'm not really happy with it yet:

"""
Note: This document is aimed at users and thus starts to count at one here and
in other places.  Use ``seq 0 17`` instead to start counting at zero, as it's
normal for developers.
"""

See below for full context. Anyway: I can change the text to start at zero if
you prefer it.

Ciao, Thorsten

---

Tainted kernels
---------------

The kernel will mark itself as 'tainted' when something occurs that might be
relevant later when investigating problems. Don't worry too much about this,
most of the time it's not a problem to run a tainted kernel; the information is
mainly of interest once someone wants to investigate some problem, as its real
cause might be the event that got the kernel tainted. That's why bug reports
from tainted kernels will often be ignored by developers, hence try to reproduce
problems with an untainted kernel.

Note the kernel will remain tainted even after you undo what caused the taint
(i.e. unload a proprietary kernel module), to indicate the kernel remains not
trustworthy. That's also why the kernel will print the tainted state when it
notices an internal problem (a 'kernel bug'), a recoverable error
('kernel oops') or a non-recoverable error ('kernel panic') and writes debug
information about this to the logs ``dmesg`` outputs. It's also possible to
check the tainted state at runtime through a file in ``/proc/``.

Tainted flag in bugs, oops or panics messages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You find the tainted state near the top in a line starting with 'CPU:'; if or
why the kernel is shown after the Process ID ('PID:') and a shortened name of
the command ('Comm:') that triggered the event:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 4424 Comm: insmod Tainted: P        W  O      4.20.0-0.rc6.fc30 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:my_oops_init+0x13/0x1000 [kpanic]
[...]

You'll find a **'Not tainted: '** there if the kernel was not tainted at the
time of the event; if it was, then it will print **'Tainted: '** and characters
either letters or blanks. The meaning of those characters is explained in the
table below. In above example it's '``Tainted: P        W  O     ``' as as the
kernel got tainted earlier because a proprietary Module (``P``) was loaded, a
warning occurred (``W``), and an externally-built module was loaded (``O``).
To decode other letters use the table below.

Decoding tainted state at runtime
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At runtime, you can query the tainted state by reading
``cat /proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
tainted; any other number indicates the reasons why it is. The easiest way to
decode that number is the script ``tools/debugging/kernel-chktaint``, which your
distribution might ship as part of a package called ``linux-tools`` or
``kernel-tools``; if it doesn't you can download the script from
`git.kernel.org <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/debugging/kernel-chktaint>`_.
and execute it with ``sh kernel-chktaint``

If you do not want to run that script you can try to decode the number yourself.
That's easy if there was only one reason that got your kernel tainted, as in
this case you can find the number with the table below. If there were multiple
reasons you need to decode the number, as it is a bitfield, where each bit
indicates the absence or presence of a particular type of taint. It's best to
leave that to the aforementioned script, but if you need something quick you can
use this shell command to check which bits are set:

	$ for i in $(seq 18); do echo $i $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done

Note: This document is aimed at users and thus starts to count at one here and
in other places.  Use ``seq 0 17`` instead to start counting at zero, as it's
normal for developers.

Table for decoding tainted state
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

====  ===  ======  ========================================================
Pos.  Log  Number  Reason that got the kernel tainted
====  ===  ======  ========================================================
  1)  G/P       0  proprietary module was loaded
  2)  _/F       2  module was force loaded
  3)  _/S       4  SMP kernel oops on an officially SMP incapable processor
  4)  _/R       8  module was force unloaded
  5)  _/M      16  processor reported a Machine Check Exception (MCE)
  6)  _/B      32  bad page referenced or some unexpected page flags
  7)  _/U      64  taint requested by userspace application
  8)  _/D     128  kernel died recently, i.e. there was an OOPS or BUG
  9)  _/A     256  ACPI table overridden by user
 10)  _/W     512  kernel issued warning
 11)  _/C    1024  staging driver was loaded
 12)  _/I    2048  workaround for bug in platform firmware applied
 13)  _/O    4096  externally-built ("out-of-tree") module was loaded
 14)  _/E    8192  unsigned module was loaded
 15)  _/L   16384  soft lockup occurred
 16)  _/K   32768  Kernel live patched
 17)  _/K   65536  Auxiliary taint, defined for and used by distros
 18)  _/K  131072  Kernel was built with the struct randomization plugin
====  ===  ======  ========================================================

Note: To make reading easier ``_`` is representing a blank in this
table.

More detailed explanation for tainting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 1)  ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
     any proprietary module has been loaded.  Modules without a
     MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
     insmod as GPL compatible are assumed to be proprietary.

 2)  ``F`` if any module was force loaded by ``insmod -f``, ``' '`` if all
     modules were loaded normally.

 3)  ``S`` if the oops occurred on an SMP kernel running on hardware that
     hasn't been certified as safe to run multiprocessor.
     Currently this occurs only on various Athlons that are not
     SMP capable.

 4)  ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
     modules were unloaded normally.

 5)  ``M`` if any processor has reported a Machine Check Exception,
     ``' '`` if no Machine Check Exceptions have occurred.

 6)  ``B`` if a page-release function has found a bad page reference or
     some unexpected page flags.

 7)  ``U`` if a user or user application specifically requested that the
     Tainted flag be set, ``' '`` otherwise.

 8)  ``D`` if the kernel has died recently, i.e. there was an OOPS or BUG.

 9)  ``A`` if the ACPI table has been overridden.

 10) ``W`` if a warning has previously been issued by the kernel.
     (Though some warnings may set more specific taint flags.)

 11) ``C`` if a staging driver has been loaded.

 12) ``I`` if the kernel is working around a severe bug in the platform
     firmware (BIOS or similar).

 13) ``O`` if an externally-built ("out-of-tree") module has been loaded.

 14) ``E`` if an unsigned module has been loaded in a kernel supporting
     module signature.

 15) ``L`` if a soft lockup has previously occurred on the system.

 16) ``K`` if the kernel has been live patched.

 17) ``X`` Auxiliary taint, defined for and used by Linux distributors.

 18) ``T`` Kernel was build with randstruct plugin, which can intentionally
     produce extremely unusual kernel structure layouts (even performance
     pathological ones), which is important to know when debugging. Set at
     build time.