Re: [PATCH 1/2] ACPI, APEI, GHES: Remove strict check for memory error handling

On Mon, Nov 25, 2013 at 11:38:23AM +0530, Naveen N. Rao wrote:
> Date: Mon, 25 Nov 2013 11:38:23 +0530
> From: "Naveen N. Rao" <naveen.n.rao@xxxxxxxxxxxxxxxxxx>
> To: "Chen, Gong" <gong.chen@xxxxxxxxxxxxxxx>
> Cc: tony.luck@xxxxxxxxx, bp@xxxxxxxxx, linux-acpi@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 1/2] ACPI, APEI, GHES: Remove strict check for memory
>  error handling
> User-Agent: Mutt/1.5.21 (2010-09-15)
> 
> On 2013/11/22 12:57AM, Chen Gong wrote:
> > SCI is usually employed to handle corrected errors, especially
> > corrected memory errors, but in fact SCI can also be used to
> > handle any error, such as an uncorrected memory error, if the
> > BIOS enables it. In that situation, uncorrected memory errors
> > should be logged just as corrected errors are.
> > 
> > Signed-off-by: Chen, Gong <gong.chen@xxxxxxxxxxxxxxx>
> > ---
> >  arch/x86/include/asm/mce.h            | 3 +--
> >  arch/x86/kernel/cpu/mcheck/mce-apei.c | 6 ++----
> >  drivers/acpi/apei/ghes.c              | 3 +--
> >  3 files changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> > index cbe6b9e..94b263f 100644
> > --- a/arch/x86/include/asm/mce.h
> > +++ b/arch/x86/include/asm/mce.h
> > @@ -244,7 +244,6 @@ static inline void mcheck_intel_therm_init(void) { }
> >   */
> >  
> >  struct cper_sec_mem_err;
> > -extern void apei_mce_report_mem_error(int corrected,
> > -				      struct cper_sec_mem_err *mem_err);
> > +extern void apei_mce_report_mem_error(struct cper_sec_mem_err *mem_err);
> >  
> >  #endif /* _ASM_X86_MCE_H */
> > diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > index cd8b166..f09da48 100644
> > --- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > +++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > @@ -37,13 +37,11 @@
> >  
> >  #include "mce-internal.h"
> >  
> > -void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> > +void apei_mce_report_mem_error(struct cper_sec_mem_err *mem_err)
> >  {
> >  	struct mce m;
> >  
> > -	/* Only corrected MC is reported */
> > -	if (!corrected || !(mem_err->validation_bits &
> > -				CPER_MEM_VALID_PHYSICAL_ADDRESS))
> > +	if (!(mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS))
> >  		return;
> 
> This won't be enough. Further down, you'll see that all memory errors
> get logged as corrected errors due to the hardcoded MCE status. A lot
> more work will be needed if we have to log GHES errors through mcelog
> properly.
> 
> - Naveen
> 
Sure. In fact, the most valuable information from CPER is the physical
address; most of the other data is faked or of little value. But
at least we can make it more precise.
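
To illustrate the point about the hardcoded status, here is a minimal
standalone sketch (not the actual mce-apei.c code) of how a faked status
word could be derived from an error severity instead of always describing
a corrected error. The bit positions follow the architectural
IA32_MCi_STATUS layout; the SKETCH_* names, the severity enum, and the
helper function are hypothetical and exist only for this example.

```c
#include <stdint.h>

/*
 * Bit positions mirror the architectural IA32_MCi_STATUS layout. They are
 * redefined under SKETCH_* names (an assumption of this example) so the
 * sketch builds outside the kernel tree.
 */
#define SKETCH_STATUS_VAL   (1ULL << 63)  /* register contents are valid */
#define SKETCH_STATUS_UC    (1ULL << 61)  /* error was not corrected */
#define SKETCH_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define SKETCH_STATUS_ADDRV (1ULL << 58)  /* address field is valid */
#define SKETCH_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

/* Hypothetical severity levels, ordered from least to most severe. */
enum sketch_severity {
	SKETCH_SEV_CORRECTED,
	SKETCH_SEV_RECOVERABLE,
	SKETCH_SEV_FATAL,
};

/*
 * Build a faked status word for a memory error. A corrected error gets
 * only the "valid" bits; more severe errors additionally set UC and PCC.
 */
static uint64_t sketch_mem_err_status(enum sketch_severity sev)
{
	uint64_t status = SKETCH_STATUS_VAL | SKETCH_STATUS_EN |
			  SKETCH_STATUS_ADDRV;

	if (sev >= SKETCH_SEV_RECOVERABLE)
		status |= SKETCH_STATUS_UC;
	if (sev >= SKETCH_SEV_FATAL)
		status |= SKETCH_STATUS_PCC;

	return status;
}
```

With something along these lines, a consumer of the record could tell an
uncorrected memory error from a corrected one by testing the UC bit,
which a single fixed status value cannot express.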


