On Sun, Mar 02, 2025 at 03:14:52PM +0800, Shuai Xue wrote:
> > > "mce: Uncorrected hardware memory error in user-access at 3b116c400"
>
> It is the current message in kill_me_maybe(), not added by me.

Doesn't change the fact that it is not really helpful when it comes to logging
all errors properly.

[ Properly means using a structured log format with the tracepoint and not
  dumping it into dmesg. ]

And figuring out what hw is failing so that it can be replaced. No one has come
with a real need for making it better, more useful.

You're coming with what I think is such a need and I'm trying to explain to you
what needs to be done. But you want to feed your AI with dmesg and solve it this
way.

If you wanna do it right, we can talk. Otherwise, have fun.

> 3. We need to identify and implement potential improvements.
>
> "mce: Uncorrected hardware memory error in user-access at 3b116c400"
>
> is *nothing* but
>
> "mce: Action required: data load in error recoverable area of kernel"
>
> helps.

I don't think you've read what I wrote but that's ok. If you think it helps, you
can keep it in your kernels.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette