Re: [RESEND][PATCH 1/3] x86: Add task_struct flag to force SIGBUS on MCE

Andrew Zaborowski <andrew.zaborowski@xxxxxxxxx> · Sat, 10 Aug 2024 05:55:41 +0200

On Sat, 10 Aug 2024 at 05:21, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Sat, Aug 10, 2024 at 03:20:10AM +0200, Andrew Zaborowski wrote:
> > True, though that's hard to link to a specific process crash.
>
> The process name when the MCE gets reported is *actually* there in the
> splat: current->comm.

That's the current process.  The list of processes to be signalled is
determined later and not in a simple way.

>
> > Supporting something generally includes supporting the common and the
> > obscure cases.
>
> Bullshit. We certainly won't support some obscure use cases just
> because.

It's simple reliability: if you support something only sometimes, no
one can rely on it, at least not without a deep analysis of their
kernel code snapshot.

>
> > From the user's point of view the kernel has been committed to
> > supporting these scenarios indefinitely or until the deprecation of
> > the SIGBUS-on-memory-error logic, and simply has a bug.
>
> And lemme repeat my question:
>
> So why does it matter if a process which is being executed and gets an
> MCE beyond the point of no return absolutely needs to return SIGBUS vs
> it getting killed and you still get an MCE logged on the machine, in
> either case?
>
> Bug which is seen by whom or what?

In the case I know of, by the parent process: it bases some decision
on the signal number, relying on the expected behavior from the
kernel even if that behavior is not unambiguously documented.
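
To illustrate the kind of parent-side logic I mean, here is a rough
userspace sketch, not the actual code involved.  The function names and
the outcome labels are made up for the example; the only assumption is
that the parent tells a SIGBUS death apart from other terminations.

```python
import os
import signal


def classify_child_exit(status: int) -> str:
    """Map a waitpid() status to a coarse outcome (hypothetical policy)."""
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        if sig == signal.SIGBUS:
            # The parent assumes this means a memory error hit the child.
            return "memory-error"
        if sig == signal.SIGKILL:
            # Indistinguishable from an OOM kill or an admin's kill -9.
            return "killed"
        return "other-signal"
    if os.WIFEXITED(status):
        return "exited"
    return "unknown"


def run_and_classify(child_fn) -> str:
    """Fork, run child_fn in the child, and classify how the child ended."""
    pid = os.fork()
    if pid == 0:
        try:
            child_fn()
        finally:
            # Only reached if child_fn returned without a fatal signal.
            os._exit(0)
    _, status = os.waitpid(pid, 0)
    return classify_child_exit(status)
```

A parent written this way takes a different path when the child dies
with a signal other than SIGBUS, which is why the signal number matters
to it.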

Like I said, it can be worked around in userspace; my change doesn't
*enable* the use case.

Best regards