Re: [PATCH v2 1/2] ACPI, APEI, GHES: Remove strict check for memory error handling

On 11/25/2013 12:45 PM, Chen, Gong wrote:
SCI is usually employed to handle corrected errors, especially
corrected memory errors. In fact, SCI can also be used to handle
any error, such as an uncorrected memory error or even a fatal
error, if the BIOS enables it. Errors reported this way should
be logged, too.

v1 -> v2: make the event record more precise

Signed-off-by: Chen, Gong <gong.chen@xxxxxxxxxxxxxxx>
---
  arch/x86/kernel/cpu/mcheck/mce-apei.c | 10 +++++++---
  drivers/acpi/apei/ghes.c              |  3 +--
  2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
index de8b60a..d137ab8 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
@@ -33,6 +33,7 @@
  #include <linux/acpi.h>
  #include <linux/cper.h>
  #include <acpi/apei.h>
+#include <acpi/ghes.h>
  #include <asm/mce.h>

  #include "mce-internal.h"
@@ -41,14 +42,17 @@ void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
  {
  	struct mce m;

-	/* Only corrected MC is reported */
-	if (!corrected || !(mem_err->validation_bits & CPER_MEM_VALID_PA))
+	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
  		return;

  	mce_setup(&m);
  	m.bank = 1;
-	/* Fake a memory read corrected error with unknown channel */
+	/* Fake a memory read error with unknown channel */
  	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
+	if (corrected >= GHES_SEV_RECOVERABLE)
+		m.status |= MCI_STATUS_UC;
+	if (corrected >= GHES_SEV_PANIC)
+		m.status |= MCI_STATUS_PCC;

Hmm... so you only fill in the most basic information from the CPER record. In the absence of the 'S' and 'AR' bits, I am not sure how useful this is, except for logging the error through /dev/mcelog for legacy users. If that is the intent, you have my

Acked-by: Naveen N. Rao <naveen.n.rao@xxxxxxxxxxxxxxxxxx>


- Naveen

  	m.addr = mem_err->physical_addr;
  	mce_log(&m);
  	mce_notify_irq();
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index a30bc31..ce3683d 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -453,8 +453,7 @@ static void ghes_do_proc(struct ghes *ghes,
  			ghes_edac_report_mem_error(ghes, sev, mem_err);

  #ifdef CONFIG_X86_MCE
-			apei_mce_report_mem_error(sev == GHES_SEV_CORRECTED,
-						  mem_err);
+			apei_mce_report_mem_error(sev, mem_err);
  #endif
  			ghes_handle_memory_failure(gdata, sev);
  		}
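
For readers following the thread, here is a minimal user-space sketch of the severity-to-status mapping that the mce-apei.c hunk above introduces. It is an illustration only, not kernel code: the helper name sev_to_mci_status() is hypothetical, the MCI_STATUS_* values are copied from the architectural IA32_MCi_STATUS bit layout, and the GHES_SEV_* values are assumed to match the enum in acpi/ghes.h. As noted in the review comment, the 'S' and 'AR' bits are not set here either.

```c
#include <stdint.h>

/* IA32_MCi_STATUS bits used by the patch (architectural bit positions). */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register contents are valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* address field is valid */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

/* Assumed to mirror the GHES severity enum in acpi/ghes.h. */
enum {
	GHES_SEV_NO          = 0x0,
	GHES_SEV_CORRECTED   = 0x1,
	GHES_SEV_RECOVERABLE = 0x2,
	GHES_SEV_PANIC       = 0x3,
};

/*
 * Hypothetical helper: build the faked status word for a memory read
 * error with unknown channel (MCA error code 0x9f), then raise UC for
 * recoverable-or-worse and PCC for panic severity.
 */
static uint64_t sev_to_mci_status(int sev)
{
	uint64_t status = MCI_STATUS_VAL | MCI_STATUS_EN |
			  MCI_STATUS_ADDRV | 0x9f;

	if (sev >= GHES_SEV_RECOVERABLE)
		status |= MCI_STATUS_UC;
	if (sev >= GHES_SEV_PANIC)
		status |= MCI_STATUS_PCC;

	return status;
}
```

With this mapping, a corrected error keeps UC and PCC clear, a recoverable error sets UC only, and a panic-severity error sets both.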

