Patch "x86/mce: Defer processing of early errors" has been added to the 5.14-stable tree

This is a note to let you know that I've just added the patch titled

    x86/mce: Defer processing of early errors

to the 5.14-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     x86-mce-defer-processing-of-early-errors.patch
and it can be found in the queue-5.14 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 2af22b7a292e7a7bec68a2bb2a9586e09d630e16
Author: Borislav Petkov <bp@xxxxxxxxx>
Date:   Mon Aug 23 17:31:29 2021 -0700

    x86/mce: Defer processing of early errors
    
    [ Upstream commit 3bff147b187d5dfccfca1ee231b0761a89f1eff5 ]
    
    When a fatal machine check results in a system reset, Linux does not
    clear the error(s) from machine check bank(s) - hardware preserves the
    machine check banks across a warm reset.
    
    During initialization of the kernel after the reboot, Linux reads, logs,
    and clears all machine check banks.
    
    But there is a problem. In:
    
      5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
    
    the call to mce_register_decode_chain() moved later in the boot
    sequence. This means that /dev/mcelog doesn't see those early error
    logs.
    
    This was partially fixed by:
    
      cd9c57cad3fe ("x86/MCE: Dump MCE to dmesg if no consumers")
    
    which made sure that the logs were not lost completely by printing
    to the console. But parsing console logs is error prone. Users of
    /dev/mcelog should expect to find any early errors logged to standard
    places.
    
    Add a new flag MCP_QUEUE_LOG to machine_check_poll() to be used in early
    machine check initialization to indicate that any errors found should
    just be queued to genpool. When mcheck_late_init() is called it will
    call mce_schedule_work() to actually log and flush any errors queued in
    the genpool.
    
     [ Based on an original patch, commit message by and completely
       productized by Tony Luck. ]
    
    Fixes: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
    Reported-by: Sumanth Kamatala <skamatala@xxxxxxxxxxx>
    Signed-off-by: Borislav Petkov <bp@xxxxxxx>
    Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
    Signed-off-by: Borislav Petkov <bp@xxxxxxx>
    Link: https://lkml.kernel.org/r/20210824003129.GA1642753@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
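
The deferral described in the commit message can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
the kernel implementation: every name in it (POLL_QUEUE_ONLY, poll_record,
flush_pending, demo_early_boot and so on) is hypothetical, the fixed-size
array merely stands in for the genpool, the function-pointer table stands in
for the decode chain, and the status values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model: none of these names are kernel APIs. */
#define POLL_QUEUE_ONLY	(1u << 3)	/* plays the role of MCP_QUEUE_LOG */
#define MAX_PENDING	8
#define MAX_CONSUMERS	4

struct err_record {
	int bank;
	unsigned long long status;	/* arbitrary illustrative value */
};

typedef void (*consumer_fn)(const struct err_record *rec);

static struct err_record pending[MAX_PENDING];	/* stands in for the genpool */
static size_t n_pending;
static consumer_fn consumers[MAX_CONSUMERS];	/* stands in for the decode chain */
static size_t n_consumers;
static int n_delivered;

static void register_consumer(consumer_fn fn)
{
	if (n_consumers < MAX_CONSUMERS)
		consumers[n_consumers++] = fn;
}

/* Hand one record to every registered consumer. */
static void deliver(const struct err_record *rec)
{
	for (size_t i = 0; i < n_consumers; i++)
		consumers[i](rec);
}

/* Poll step: either park the record or deliver it right away. */
static void poll_record(unsigned int flags, const struct err_record *rec)
{
	if (flags & POLL_QUEUE_ONLY) {
		if (n_pending < MAX_PENDING)
			pending[n_pending++] = *rec;
	} else {
		deliver(rec);
	}
}

/* Late-init step: drain everything parked during early boot. */
static void flush_pending(void)
{
	for (size_t i = 0; i < n_pending; i++)
		deliver(&pending[i]);
	n_pending = 0;
}

static void print_consumer(const struct err_record *rec)
{
	printf("bank %d status 0x%llx\n", rec->bank, rec->status);
	n_delivered++;
}

/* Returns how many records reached the consumer. */
static int demo_early_boot(void)
{
	const struct err_record leftovers[] = {
		{ .bank = 0, .status = 0xb200000000000001ULL },
		{ .bank = 4, .status = 0xb200000000000002ULL },
	};

	n_delivered = 0;
	for (size_t i = 0; i < 2; i++)
		poll_record(POLL_QUEUE_ONLY, &leftovers[i]);
	assert(n_delivered == 0);	/* no consumer registered yet */

	register_consumer(print_consumer);	/* consumer shows up late */
	flush_pending();
	assert(n_pending == 0);
	return n_delivered;
}
```

Calling demo_early_boot() once from a main() queues both records while no
consumer is registered, then delivers them after registration and returns 2.
Without the queue-only flag, the same two records would be handed to an empty
consumer table and never reach the late-registered consumer.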

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 0607ec4f5091..da9321548f6f 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -265,6 +265,7 @@ enum mcp_flags {
 	MCP_TIMESTAMP	= BIT(0),	/* log time stamp */
 	MCP_UC		= BIT(1),	/* log uncorrected errors */
 	MCP_DONTLOG	= BIT(2),	/* only clear, don't log */
+	MCP_QUEUE_LOG	= BIT(3),	/* only queue to genpool */
 };
 bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
 
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 22791aadc085..8cb7816d03b4 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -817,7 +817,10 @@ log_it:
 		if (mca_cfg.dont_log_ce && !mce_usable_address(&m))
 			goto clear_it;
 
-		mce_log(&m);
+		if (flags & MCP_QUEUE_LOG)
+			mce_gen_pool_add(&m);
+		else
+			mce_log(&m);
 
 clear_it:
 		/*
@@ -1639,10 +1642,12 @@ static void __mcheck_cpu_init_generic(void)
 		m_fl = MCP_DONTLOG;
 
 	/*
-	 * Log the machine checks left over from the previous reset.
+	 * Log the machine checks left over from the previous reset. Log them
+	 * only, do not start processing them. That will happen in mcheck_late_init()
+	 * when all consumers have been registered on the notifier chain.
 	 */
 	bitmap_fill(all_banks, MAX_NR_BANKS);
-	machine_check_poll(MCP_UC | m_fl, &all_banks);
+	machine_check_poll(MCP_UC | MCP_QUEUE_LOG | m_fl, &all_banks);
 
 	cr4_set_bits(X86_CR4_MCE);
 


