Re: [RFC PATCH v1 0/2] Userspace Can Control Memory Failure Recovery

Jiaqi Yan <jiaqiyan@xxxxxxxxxx> · Thu, 3 Oct 2024 16:19:12 -0700

Hi Tony,

On Thu, Oct 3, 2024 at 3:58 PM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>
> > Are you suggesting you prefer the per-VMA policy, or proposing a new
> > "per-process policy" added via prctl? By "per-process", I imagine the
> > policy to keep or offline the poisoned page will apply to all its
> > VMAs?
>
> A "per-process policy" using prctl already exists. See prctl(PR_MCE_KILL).

The policy I want is not about "whether to send SIGBUS or not" or
"when to send SIGBUS"; it is about whether to offline the error
[huge]page or keep it accessible to the process.
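
For contrast, here is a minimal sketch of what the existing per-process
knob covers. It uses only the documented prctl(2) options PR_MCE_KILL
and PR_MCE_KILL_GET; the helper names (mce_policy_name,
request_early_kill, current_mce_policy) are made up for illustration.
Note that nothing in it says anything about offlining the page: it only
selects when SIGBUS is delivered.

```c
#include <sys/prctl.h>   /* prctl(), PR_MCE_KILL* constants */

/* Hypothetical helper: map a PR_MCE_KILL_GET result to a readable name. */
static const char *mce_policy_name(int policy)
{
    switch (policy) {
    case PR_MCE_KILL_EARLY:   return "early";
    case PR_MCE_KILL_LATE:    return "late";
    case PR_MCE_KILL_DEFAULT: return "default";
    default:                  return "unknown";
    }
}

/*
 * Hypothetical helper: ask for SIGBUS as soon as the kernel learns of an
 * error asynchronously, instead of only when the poisoned data is touched.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int request_early_kill(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}

/*
 * Hypothetical helper: read back the calling thread's current policy.
 * Returns one of the PR_MCE_KILL_* policy values, or -1 on failure.
 */
static int current_mce_policy(void)
{
    return prctl(PR_MCE_KILL_GET, 0, 0, 0, 0);
}
```

A caller would invoke request_early_kill() once at startup and could log
mce_policy_name(current_mce_policy()) to confirm the setting. Both calls
affect SIGBUS timing only.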

> Currently used to choose whether to eagerly send SIGBUS to a process
> when a memory error is discovered asynchronously by a h/w patrol scrubber.
>
> What is the use case for a per-VMA policy? Do you have some application
> that would like to use this?

Our main use case is the virtual machine monitor (VMM) and its VMs.
The VMM can track the *guest* physical addresses that are backed by
*host* physical addresses with errors. We'd like the VM to be able to
continue loading guest data from the error [huge]page. Loading the
clean portion should just work; loading the poisoned portion will be
intercepted by KVM + VMM without going down to the kernel / firmware /
hardware.
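
To make the tracking idea concrete, here is a rough sketch of the kind
of bookkeeping a VMM could keep for one guest hugepage. Everything in it
is an assumption for illustration: the 2 MiB hugepage size, the 4 KiB
poison granularity, and the names guest_hugepage, mark_poisoned and
gpa_is_poisoned are all hypothetical and not taken from any existing
VMM or from this patch series.

```c
#include <stdbool.h>
#include <stdint.h>

#define SUBPAGE_SHIFT 12                            /* assumed 4 KiB poison granularity */
#define HUGEPAGE_SIZE (2ULL << 20)                  /* assumed 2 MiB hugepage */
#define SUBPAGES      (HUGEPAGE_SIZE >> SUBPAGE_SHIFT)  /* 512 subpages */

/* Hypothetical per-hugepage record kept by the VMM. */
struct guest_hugepage {
    uint64_t gpa_base;                 /* guest physical address of the hugepage */
    uint64_t poisoned[SUBPAGES / 64];  /* one bit per 4 KiB subpage */
};

/* True if gpa falls inside this hugepage. */
static bool gpa_in_page(const struct guest_hugepage *hp, uint64_t gpa)
{
    return gpa >= hp->gpa_base && gpa - hp->gpa_base < HUGEPAGE_SIZE;
}

/* Record that the subpage containing gpa is backed by poisoned host memory. */
static void mark_poisoned(struct guest_hugepage *hp, uint64_t gpa)
{
    if (!gpa_in_page(hp, gpa))
        return;
    uint64_t idx = (gpa - hp->gpa_base) >> SUBPAGE_SHIFT;
    hp->poisoned[idx / 64] |= 1ULL << (idx % 64);
}

/* True if a guest access to gpa must be intercepted rather than passed through. */
static bool gpa_is_poisoned(const struct guest_hugepage *hp, uint64_t gpa)
{
    if (!gpa_in_page(hp, gpa))
        return false;
    uint64_t idx = (gpa - hp->gpa_base) >> SUBPAGE_SHIFT;
    return (hp->poisoned[idx / 64] >> (idx % 64)) & 1;
}
```

With a record like this, accesses to clean subpages proceed normally,
and only accesses for which gpa_is_poisoned() returns true need to be
handled by KVM + VMM.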

>
> -Tony