On Thu, Oct 08, 2020 at 01:42:01PM +0200, Vlastimil Babka wrote: > Memory offline relies on page isolation can race with process freeing pages to > pcplists in a way that a page from isolated pageblock can end up on pcplist. > This can be worked around by repeated draining of pcplists, as done by commit > 968318261221 ("mm/memory_hotplug: drain per-cpu pages again during memory > offline"). > > David and Michal would prefer that this race was closed in a way that callers > of page isolation who need stronger guarantees don't need to repeatedly drain. > David suggested disabling pcplists usage completely during page isolation, > instead of repeatedly draining them. > > To achieve this without adding special cases in alloc/free fastpath, we can use > the same approach as boot pagesets - when pcp->high is 0, any pcplist addition > will be immediately flushed. > > The race can thus be closed by setting pcp->high to 0 and draining pcplists > once, before calling start_isolate_page_range(). The draining will serialize > after processes that already disabled interrupts and read the old value of > pcp->high in free_unref_page_commit(), and processes that have not yet disabled > interrupts, will observe pcp->high == 0 when they are rescheduled, and skip > pcplists. This guarantees no stray pages on pcplists in zones where isolation > happens. > > This patch thus adds zone_pcp_disable() and zone_pcp_enable() functions that > page isolation users can call before start_isolate_page_range() and after > unisolating (or offlining) the isolated pages. > > Also, drain_all_pages() is optimized to only execute on cpus where pcplists are > not empty. The check can however race with a free to pcplist that has not yet > increased the pcp->count from 0 to 1. Thus make the drain optionally skip the > racy check and drain on all cpus, and use this option in zone_pcp_disable(). > > As we have to avoid external updates to high and batch while pcplists are > disabled, we take pcp_batch_high_lock in zone_pcp_disable() and release it in > zone_pcp_enable(). This also synchronizes multiple users of > zone_pcp_disable()/enable(). > > Currently the only user of this functionality is offline_pages(). > > Suggested-by: David Hildenbrand <david@xxxxxxxxxx> > Suggested-by: Michal Hocko <mhocko@xxxxxxxx> > Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx> I definitely like this one, and the implemantion looks smoth. As Michal said in another thread, Hwposion code will also benefit from this, since now we have a drain_all_pages dance that might be suboptimal and not accurate. I will get back to that once this gets merged. Reviewed-by: Oscar Salvador <osalvador@xxxxxxx> Thanks! -- Oscar Salvador SUSE L3