On Wed, May 15, 2024 at 3:54 AM Oscar Salvador <osalvador@xxxxxxx> wrote:
>
> On Wed, May 15, 2024 at 12:41:42PM +0200, Borislav Petkov wrote:
> > On Fri, May 10, 2024 at 11:29:26AM -0700, Axel Rasmussen wrote:
> > > @@ -3938,7 +3938,7 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
> > >
> > >       /* Higher priority than uffd-wp when data corrupted */
> > >       if (marker & PTE_MARKER_POISONED)
> > > -             return VM_FAULT_HWPOISON;
> > > +             return VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_SILENT;
> >
> > If you know here that this poisoning should be silent, why do you have
> > to make it all complicated and propagate it into arch code, waste
> > a separate VM_FAULT flag just for that instead of simply returning here
> > a VM_FAULT_COMPLETED or some other innocuous value which would stop
> > processing the fault?
>
> AFAIK, He only wants it to be silent wrt. the arch fault handler not screaming,
> but he still wants to be able to trigger force_sig_mceerr().

Right, the goal is to still have the process get a SIGBUS, but to avoid
the "MCE error" log message. The basic issue is that unprivileged users
can set these markers up, and thereby completely spam up the log. Also,
since this is a process-specific thing and not a real hardware poison
event, it's unclear whether system admins care about it at all at a
global level (this is why we didn't want to just switch to
printk_ratelimited, for example). Better to let the process handle the
SIGBUS however it likes for its use case (logging a message elsewhere,
etc.).

That said, one thing I'm not sure about is whether or not
VM_FAULT_SIGBUS is a viable alternative (returned for a new PTE marker
type specific to simulated poison). The goal of the simulated poison
feature is to "closely simulate" a real hardware poison event.
If you live-migrate a VM from a host with real poisoned memory to a new
host, you'd want to keep the same behavior if the guest accessed those
addresses again, so as not to confuse the guest about why it suddenly
became "un-poisoned". At a basic level I think VM_FAULT_SIGBUS gives us
what we want (send SIGBUS to the process, don't log about MCEs), but I'm
not confident I know all the differences vs. VM_FAULT_HWPOISON on all
the arches.

>
>
> --
> Oscar Salvador
> SUSE Labs
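
To make the userspace side of the VM_FAULT_SIGBUS vs. VM_FAULT_HWPOISON
question a bit more concrete, here is a minimal sketch of how a process
might tell the two apart from its SIGBUS handler. It assumes (this is my
understanding of x86 only, other arches may differ) that the HWPOISON
path ends in force_sig_mceerr() with si_code BUS_MCEERR_AR, while a
plain VM_FAULT_SIGBUS is delivered with BUS_ADRERR. The helper names
classify_sigbus(), sigbus_handler() and install_sigbus_handler() are
made up for illustration and are not part of the patch.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Fallbacks in case the libc headers predate these si_code values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical helper: map a SIGBUS si_code to a short description. */
static const char *classify_sigbus(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:
		return "poison, action required";
	case BUS_MCEERR_AO:
		return "poison, action optional";
	case BUS_ADRERR:
		return "plain SIGBUS (BUS_ADRERR)";
	default:
		return "other SIGBUS";
	}
}

/*
 * Hypothetical handler: report what kind of SIGBUS arrived, then exit.
 * A real VMM would presumably forward the event to the guest instead.
 */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
	const char *msg = classify_sigbus(info->si_code);

	(void)sig;
	(void)ucontext;
	write(STDERR_FILENO, msg, strlen(msg));
	write(STDERR_FILENO, "\n", 1);
	_exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;	/* needed to receive siginfo_t */
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

If the assumption above holds, a guest-facing VMM that keys off
BUS_MCEERR_AR would stop seeing the event as poison once the kernel
returns VM_FAULT_SIGBUS, which is the kind of difference that would need
checking per arch.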