Re: [PATCH v2 1/2] cper, apei, mce: Pass x86 CPER through the MCA handling chain

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Smita,

A couple of comments below -

Smita Koralahalli <Smita.KoralahalliChannabasappa@xxxxxxx> writes:

> Linux Kernel uses ACPI Boot Error Record Table (BERT) to report fatal
> errors that occurred in a previous boot. The MCA errors in the BERT are
> reported using the x86 Processor Error Common Platform Error Record (CPER)
> format. Currently, the record prints out the raw MSR values and AMD relies
> on the raw record to provide MCA information.
>
> Extract the raw MSR values of MCA registers from the BERT and feed it into
> the standard mce_log() function through the existing x86/MCA RAS
> infrastructure. This will result in better decoding from the EDAC MCE
> decoder or the default notifier.
>
> The implementation is SMCA specific as the raw MCA register values are
> given in the register offset order of the MCAX address space.
>
> Reported-by: kernel test robot <lkp@xxxxxxxxx>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@xxxxxxx>
> ---

[...]


> diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
> index 2531de49f56c..374b8e18552a 100644
> --- a/drivers/firmware/efi/cper-x86.c
> +++ b/drivers/firmware/efi/cper-x86.c
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  // Copyright (C) 2018, Advanced Micro Devices, Inc.
>  
> -#include <linux/cper.h>

Why is the include dropped? AFAICT, the definitions from there are still
being used after this patch.

> +#include <acpi/apei.h>
>  
>  /*
>   * We don't need a "CPER_IA" prefix since these are all locally defined.
> @@ -347,9 +347,11 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
>  			       ctx_info->mm_reg_addr);
>  		}
>  
> -		printk("%sRegister Array:\n", newpfx);
> -		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, groupsize,
> -			       (ctx_info + 1), ctx_info->reg_arr_size, 0);
> +		if (arch_apei_report_x86_error(ctx_info, proc->lapic_id)) {
> +			printk("%sRegister Array:\n", newpfx);
> +			print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, groupsize,
> +				       (ctx_info + 1), ctx_info->reg_arr_size, 0);
> +		}
>  
>  		ctx_info = (struct cper_ia_proc_ctx *)((long)ctx_info + size);
>  	}
> diff --git a/include/acpi/apei.h b/include/acpi/apei.h
> index 680f80960c3d..44d4d08acce0 100644
> --- a/include/acpi/apei.h
> +++ b/include/acpi/apei.h
> @@ -33,8 +33,15 @@ extern bool ghes_disable;
>  
>  #ifdef CONFIG_ACPI_APEI
>  void __init acpi_hest_init(void);
> +int arch_apei_report_x86_error(struct cper_ia_proc_ctx *ctx_info,
> +			       u64 lapic_id);
>  #else
>  static inline void acpi_hest_init(void) { return; }
> +static inline int arch_apei_report_x86_error(struct cper_ia_proc_ctx *ctx_info,
> +					     u64 lapic_id)
> +{
> +	return -EINVAL;
> +}
>  #endif

Adding the declaration to this include violates the separation of
generic and architecture specific code.

Can this be moved to the appropriate architecture specific header?
Perhaps arch/x86/include/asm/apei.h.

>  typedef int (*apei_hest_func_t)(struct acpi_hest_header *hest_hdr, void *data);
> @@ -51,6 +58,8 @@ int erst_clear(u64 record_id);
>  
>  int arch_apei_enable_cmcff(struct acpi_hest_header *hest_hdr, void *data);
>  void arch_apei_report_mem_error(int sev, struct cper_sec_mem_err *mem_err);
> +int arch_apei_report_x86_error(struct cper_ia_proc_ctx *ctx_info,
> +			       u64 lapic_id);


Why is the additional declaration needed?

Thanks,
Punit

>  
>  #endif
>  #endif



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux