On Mon, Feb 1, 2021 at 12:17 AM Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:
>
> When one page is already hwpoisoned by AO action, process may not be
> killed, the process mapping this page may make a syscall include this
> page and result to trigger a VM_FAULT_HWPOISON fault, if it's in kernel
> mode it may be fixed by fixup_exception. Current code will just return
> error code to user process.
>
> This is not sufficient, we should send a SIGBUS to the process and log
> the info to console, as we can't trust the process will handle the error
> correctly.

Does this happen when one process gets SIGBUSed due to memory failure
and another process independently hits the poisoned memory? I'm not
entirely convinced that this is a problem.

In any case, this patch needs rebasing on top of my big fault series --
as it stands, it's way too difficult to keep track of which paths even
call your new code. And the various signal paths need to be
consolidated -- we already have three of them, and the last thing we
need is a fourth.