Re: [PATCH v2 1/4] x86/mce: Add wrapper for struct mce to export vendor specific info

On Tue, Jun 25, 2024 at 02:56:21PM -0500, Avadhut Naik wrote:
> Currently, exporting new additional machine check error information
> involves adding new fields for the same at the end of the struct mce.
> This additional information can then be consumed through mcelog or
> tracepoint.
> 
> However, as new MSRs are being added (and will be added in the future)
> by CPU vendors on their newer CPUs with additional machine check error
> information to be exported, the size of struct mce will balloon on some
> CPUs, unnecessarily, since those fields are vendor-specific. Moreover,
> different CPU vendors may export the additional information in varying
> sizes.
> 
> The problem particularly intensifies since struct mce is exposed to
> userspace as part of UAPI. Its bloating through vendor-specific data
> should be avoided to limit the information being sent out to userspace.
> 
> Add a new structure mce_hw_err to wrap the existing struct mce. This
> prevents struct mce from ballooning, since vendor-specific data, if any,
> can now be exported through a union within the wrapper structure and
> through __dynamic_array in the mce_record tracepoint.
> 
> Furthermore, new internal kernel fields can be added to the wrapper
> struct without impacting the user space API.
> 
> Note: Some Checkpatch checks have been ignored to maintain coding style.
> 
> [Yazen: Add last commit message paragraph.]
> 
> Suggested-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
> Signed-off-by: Avadhut Naik <avadhut.naik@xxxxxxx>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@xxxxxxx>
> ---
>  arch/x86/include/asm/mce.h              |   6 +-
>  arch/x86/kernel/cpu/mce/amd.c           |  29 ++--
>  arch/x86/kernel/cpu/mce/apei.c          |  54 +++----
>  arch/x86/kernel/cpu/mce/core.c          | 178 +++++++++++++-----------
>  arch/x86/kernel/cpu/mce/dev-mcelog.c    |   2 +-
>  arch/x86/kernel/cpu/mce/genpool.c       |  20 +--
>  arch/x86/kernel/cpu/mce/inject.c        |   4 +-
>  arch/x86/kernel/cpu/mce/internal.h      |   4 +-
>  drivers/acpi/acpi_extlog.c              |   2 +-
>  drivers/acpi/nfit/mce.c                 |   2 +-
>  drivers/edac/i7core_edac.c              |   2 +-
>  drivers/edac/igen6_edac.c               |   2 +-
>  drivers/edac/mce_amd.c                  |   2 +-
>  drivers/edac/pnd2_edac.c                |   2 +-
>  drivers/edac/sb_edac.c                  |   2 +-
>  drivers/edac/skx_common.c               |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c |   2 +-
>  drivers/ras/amd/fmpm.c                  |   2 +-
>  drivers/ras/cec.c                       |   2 +-
>  include/trace/events/mce.h              |  42 +++---
>  20 files changed, 199 insertions(+), 162 deletions(-)
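
For readers without the diff at hand, the wrapper idea from the commit
message can be sketched roughly as below. This is a userspace illustration
only, not the code from the patch: the three-field struct mce is a
stand-in for the real UAPI structure, the amd.synd1/amd.synd2 members are
hypothetical vendor fields, and to_mce_hw_err() is a hypothetical helper
in the spirit of container_of().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the UAPI struct mce; the real structure has many more fields. */
struct mce {
	uint64_t status;
	uint64_t addr;
	uint64_t misc;
};

/*
 * Kernel-internal wrapper: vendor-specific data lives in the union, so
 * struct mce itself (and therefore the userspace ABI) does not grow.
 */
struct mce_hw_err {
	struct mce m;
	union {
		struct {
			uint64_t synd1;	/* hypothetical vendor field */
			uint64_t synd2;	/* hypothetical vendor field */
		} amd;
	} vendor;
};

/* In this sketch the embedded struct mce is the first member of the wrapper. */
static_assert(offsetof(struct mce_hw_err, m) == 0, "m must come first");

/* Hypothetical helper: recover the wrapper from a pointer to the embedded struct mce. */
static inline struct mce_hw_err *to_mce_hw_err(struct mce *m)
{
	return (struct mce_hw_err *)((char *)m - offsetof(struct mce_hw_err, m));
}
```

Code that only knows about struct mce can keep taking a struct mce
pointer, while code that needs the vendor data goes through the wrapper,
e.g. to_mce_hw_err(m)->vendor.amd.synd1.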

Ok, did some minor massaging but otherwise looks ok now.

Tony, any comments? You ok with this, would that fit any Intel-specific vendor
fields too or do you need some additional Intel-specific changes?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette



