On Fri, May 21, 2021 at 12:01:54PM +0900, Naoya Horiguchi wrote: > From: Tony Luck <tony.luck@xxxxxxxxx> > > There can be races when multiple CPUs consume poison from the same > page. The first into memory_failure() atomically sets the HWPoison > page flag and begins hunting for tasks that map this page. Eventually > it invalidates those mappings and may send a SIGBUS to the affected > tasks. > > But while all that work is going on, other CPUs see a "success" > return code from memory_failure() and so they believe the error > has been handled and continue executing. > > Fix by wrapping most of the internal parts of memory_failure() in > a mutex. > > Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx> > Reviewed-by: Borislav Petkov <bp@xxxxxxx> Reviewed-by: Oscar Salvador <osalvador@xxxxxxx> -- Oscar Salvador SUSE L3