On Mon, May 09, 2022 at 05:04:54PM +0800, Miaohe Lin wrote:
> >> So that leaves us with either
> >>
> >> 1) Fail offlining -> no need to care about reonlining
>
> Maybe fail offlining will be a better alternative as we can get rid of many races
> between memory failure and memory offline? But no strong opinion. :)

If taking care of those races is not a herculean effort, I'd go with
allowing offlining + disallowing re-onlining.
Mainly because of the memory RAS stuff.

Now, as for the re-onlining thing, we'll have to come up with a way to check
whether a section contains hwpoisoned pages, so that we do not have to go and
check every single page, as that would be really suboptimal.

-- 
Oscar Salvador
SUSE Labs
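
To illustrate the per-section check mentioned above, here is a minimal
userspace sketch, not kernel code. It assumes one counter of hwpoisoned pages
per memory section, so that a re-onlining check only has to look at the
sections in the range rather than at every page. All names and sizes
(section_nr_hwpoison, PAGES_PER_SECTION_SKETCH, NR_SECTIONS_SKETCH, and the
helpers) are hypothetical, and locking against concurrent memory failure is
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes for illustration; real values are arch-specific. */
#define PAGES_PER_SECTION_SKETCH 32768UL /* e.g. 128 MiB section, 4 KiB pages */
#define NR_SECTIONS_SKETCH       64UL

/* Hypothetical per-section state: number of hwpoisoned pages in the section. */
static unsigned long section_nr_hwpoison[NR_SECTIONS_SKETCH];

/* Map a page frame number to the index of the section that contains it. */
static size_t pfn_to_section_sketch(unsigned long pfn)
{
	size_t nr = pfn / PAGES_PER_SECTION_SKETCH;

	assert(nr < NR_SECTIONS_SKETCH);
	return nr;
}

/* Would be called when a page gets marked hwpoisoned. */
static void section_hwpoison_inc(unsigned long pfn)
{
	section_nr_hwpoison[pfn_to_section_sketch(pfn)]++;
}

/* Would be called if a page is later unpoisoned. */
static void section_hwpoison_dec(unsigned long pfn)
{
	size_t nr = pfn_to_section_sketch(pfn);

	assert(section_nr_hwpoison[nr] > 0);
	section_nr_hwpoison[nr]--;
}

/*
 * Re-onlining check: walks the sections covered by the range, so the cost
 * is proportional to the number of sections, not the number of pages.
 */
static bool range_has_hwpoison(unsigned long start_pfn, unsigned long nr_pages)
{
	size_t first, last, nr;

	if (nr_pages == 0)
		return false;

	first = pfn_to_section_sketch(start_pfn);
	last = pfn_to_section_sketch(start_pfn + nr_pages - 1);

	for (nr = first; nr <= last; nr++)
		if (section_nr_hwpoison[nr])
			return true;

	return false;
}
```

With something along these lines, the re-onlining path could refuse a range
as soon as range_has_hwpoison() returns true. Whether the counter should live
in the section metadata or somewhere else is left open here.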