Re: [PATCH v2 2/5] x86/mce: dump error msg from severities

On 2025/3/2 15:37, Borislav Petkov wrote:
On Sun, Mar 02, 2025 at 03:14:52PM +0800, Shuai Xue wrote:
      "mce: Uncorrected hardware memory error in user-access at 3b116c400"

It is the current message in kill_me_maybe(), not added by me.

Doesn't change the fact that it is not really helpful when it comes to logging
all errors properly.

   [ Properly means using a structured log format with the tracepoint and not
     dumping it into dmesg. ]

And figuring out what hw is failing so that it can be replaced. No one has
come with a real need for making it better, more useful.

You're coming with what I think is such a need and I'm trying to explain to
you what needs to be done. But you want to feed your AI with dmesg and solve
it this way.

If you wanna do it right, we can talk. Otherwise, have fun.

I see. So I am just curious why we define `msg` in `severities`?

I prefer to use a structured log format with the tracepoint, and we do use it in
production, but it lacks process context.

AMD folks added error messages for panic errors [1] to help debug cases
that the EDAC driver is not able to decode.

For non-fatal errors, is it reasonable to assume that all users are using
tracepoint-based tools like Rasdaemon?

[1] https://lore.kernel.org/all/20220405183212.354606-1-carlos.bilbao@xxxxxxx/


3. We need to identify and implement potential improvements.

"mce: Uncorrected hardware memory error in user-access at 3b116c400"

is *nothing* but

"mce: Action required: data load in error recoverable area of kernel"

helps.

I don't think you've read what I wrote but that's ok. If you think it helps,
you can keep it in your kernels.


Fine, I can drop patches 1 and 2 in the next version.

Thanks.
Shuai



