On 11/20/2024 8:30 AM, Borislav Petkov wrote:
> @@ -3254,9 +3306,68 @@
>                          devices can be requested on-demand with the
>                          /dev/loop-control interface.
>
> -        mce             [X86-32] Machine Check Exception
> +        mce=            [X86-{32,64}]
> +
> +                        Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs runtime tunables.
> +
> +                        off: disable machine check
> +
> +                        no_cmci: disable CMCI(Corrected Machine Check
> +                                Interrupt) that Intel processor supports. Usually
> +                                this disablement is not recommended, but it might be
> +                                handy if your hardware is misbehaving.
> +
> +                                Note that you'll get more problems without CMCI than
> +                                with due to the shared banks, i.e. you might get
> +                                duplicated error logs.
> +
> +                        dont_log_ce: don't make logs for corrected errors.
> +                                All events reported as corrected are silently cleared
> +                                by OS. This option will be useful if you have no
> +                                interest in any of corrected errors.
> +
> +                        ignore_ce: disable features for corrected errors, e.g.
> +                                polling timer and CMCI. All events reported as
> +                                corrected are not cleared by OS and remained in its
> +                                error banks.
> +
> +                                Usually this disablement is not recommended, however
> +                                if there is an agent checking/clearing corrected
> +                                errors (e.g. BIOS or hardware monitoring
> +                                applications), conflicting with OS's error handling,
> +                                and you cannot deactivate the agent, then this option
> +                                will be a help.
> +
> +                        no_lmce: do not opt-in to Local MCE delivery. Use
> +                                legacy method to broadcast MCEs.
> +
> +                        bootlog: enable logging of machine checks left over
> +                                from booting. Disabled by default on AMD Fam10h and
> +                                older because some BIOS leave bogus ones.
> +
> +                                If your BIOS doesn't do that it's a good idea to
> +                                enable though to make sure you log even machine check
> +                                events that result in a reboot. On Intel systems it is
> +                                enabled by default.
> +
> +                        nobootlog: disable boot machine check logging.
> +
> +                        monarchtimeout (number): sets the time in us to wait
> +                                for other CPUs on machine checks. 0 to disable.
> +
> +                        bios_cmci_threshold: don't overwrite the bios-set CMCI
> +                                threshold. This boot option prevents Linux from
> +                                overwriting the CMCI threshold set by the bios.
> +                                Without this option, Linux always sets the CMCI
> +                                threshold to 1. Enabling this may make memory
> +                                predictive failure analysis less effective if the bios
> +                                sets thresholds for memory errors since we will not
> +                                see details for all errors.
> +
> +                        recovery: force-enable recoverable machine check code paths
> +
> +                        Everything else is in sysfs now.
>

Instead of double tabs and <option>: <description>, would this be more
readable if the options and their descriptions are separated? Something
like the below wouldn't increase over width either.

        mce=            [X86-{32,64}]

                        Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs runtime tunables.

                off
                        disable machine check

                no_cmci
                        disable CMCI(Corrected Machine Check Interrupt) that
                        Intel processor supports. Usually this disablement is
                        not recommended, but it might be handy if your
                        hardware is misbehaving.

                        Note that you'll get more problems without CMCI than
                        with due to the shared banks, i.e. you might get
                        duplicated error logs.

                dont_log_ce
                        don't make logs for corrected errors. All events
                        reported as corrected are silently cleared by OS. This
                        option will be useful if you have no interest in any
                        of corrected errors.

                ignore_ce
                        disable features for corrected errors, e.g. polling
                        timer and CMCI. All events reported as corrected are
                        not cleared by OS and remained in its error banks.

                        Usually this disablement is not recommended, however
                        if there is an agent checking/clearing corrected
                        errors (e.g. BIOS or hardware monitoring
                        applications), conflicting with OS's error handling,
                        and you cannot deactivate the agent, then this option
                        will be a help.

                no_lmce
                        do not opt-in to Local MCE delivery. Use legacy method
                        to broadcast MCEs.

                bootlog:
                        enable logging of machine checks left over from
                        booting. Disabled by default on AMD Fam10h and older
                        because some BIOS leave bogus ones.

                        If your BIOS doesn't do that it's a good idea to
                        enable though to make sure you log even machine check
                        events that result in a reboot. On Intel systems it is
                        enabled by default.

                nobootlog
                        disable boot machine check logging.

                monarchtimeout (number)
                        sets the time in us to wait for other CPUs on machine
                        checks. 0 to disable.

                bios_cmci_threshold
                        don't overwrite the bios-set CMCI threshold. This boot
                        option prevents Linux from overwriting the CMCI
                        threshold set by the bios. Without this option, Linux
                        always sets the CMCI threshold to 1. Enabling this may
                        make memory predictive failure analysis less effective
                        if the bios sets thresholds for memory errors since we
                        will not see details for all errors.

                recovery
                        force-enable recoverable machine check code paths

                        Everything else is in sysfs now.

> -        mce=option      [X86-64] See Documentation/arch/x86/x86_64/boot-options.rst
>
> @@ -5701,6 +5825,47 @@
>                          reboot_cpu is s[mp]#### with #### being the processor
>                          to be used for rebooting.
>
> +                        acpi: Use the ACPI RESET_REG in the FADT. If ACPI is
> +                                not configured or the ACPI reset does not work, the
> +                                reboot path attempts the reset using the keyboard
> +                                controller.
> +
> +                        bios: Use the CPU reboot vector for warm reset
> +
> +                        cold: Set the cold reboot flag
> +
> +                        default: There are some built-in platform specific
> +                                "quirks" - you may see: "reboot: <name> series board
> +                                detected. Selecting <type> for reboots." In the case
> +                                where you think the quirk is in error (e.g. you have
> +                                newer BIOS, or newer board) using this option will
> +                                ignore the built-in quirk table, and use the generic
> +                                default reboot actions.
> +
> +                        efi: Use efi reset_system runtime service. If EFI is
> +                                not configured or the EFI reset does not work, the
> +                                reboot path attempts the reset using the keyboard
> +                                controller.
> +
> +                        force: Don't stop other CPUs on reboot. This can make
> +                                reboot more reliable in some cases.
> +
> +                        kbd: Use the keyboard controller. cold reset (default)
> +
> +                        pci: Use a write to the PCI config space register
> +                                0xcf9 to trigger reboot.
> +
> +                        triple: Force a triple fault (init)
> +
> +                        warm: Don't set the cold reboot flag
> +
> +                        Using warm reset will be much faster especially on big
> +                        memory systems because the BIOS will not go through
> +                        the memory check. Disadvantage is that not all
> +                        hardware will be completely reinitialized on reboot so
> +                        there may be boot problems on some systems.
> +
> +

Same suggestion here.