On Tue, Nov 06, 2018 at 10:55:24AM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> Page state checks are racy. Under a heavy memory workload (e.g. stress
> -m 200 -t 2h) it is quite easy to hit a race window when the page is
> allocated but its state is not fully populated yet. A debugging patch to
> dump the struct page state shows
> : [ 476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> : [ 476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> : [ 476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
>
> Note that the state has been checked for both PageLRU and PageSwapBacked
> already. Closing this race completely would require some sort of retry
> logic. This can be tricky and error prone (think of potential endless
> or long taking loops).
>
> Workaround this problem for movable zones at least. Such a zone should
> only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> has_unmovable_pages more robust") has told us that this is not strictly
> true though. Bootmem pages should be marked reserved though so we can
> move the original check after the PageReserved check. Pages from other
> zones are still prone to races but we even do not pretend that memory
> hotremove works for those so pre-mature failure doesn't hurt that much.
>
> Reported-and-tested-by: Baoquan He <bhe@xxxxxxxxxx>
> Acked-by: Baoquan He <bhe@xxxxxxxxxx>
> Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
>
> Hi,
> this has been reported [1] and we have tried multiple things to address
> the issue. The only reliable way was to reintroduce the movable zone
> check into has_unmovable_pages. This time it should be safe also for
> the bug originally fixed by 15c30bc09085.
>
> [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
>  mm/page_alloc.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 863d46da6586..c6d900ee4982 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>                 if (PageReserved(page))
>                         goto unmovable;
>
> +               /*
> +                * If the zone is movable and we have ruled out all reserved
> +                * pages then it should be reasonably safe to assume the rest
> +                * is movable.
> +                */
> +               if (zone_idx(zone) == ZONE_MOVABLE)
> +                       continue;
> +
>                 /*

There is a WARN_ON() in case of failure at the end of the routine.
Is that triggered when we hit the bug? If we're adding this patch,
the WARN_ON() needs to go as well.

The check seems quite aggressive. It sits in a loop that iterates
over pages, but it has nothing to do with the page. Did you mean to
make the check

        zone_idx(page_zone(page)) == ZONE_MOVABLE

instead? It also skips all the checks for pinned pages, and the other
checks that follow.

Balbir Singh.
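
To make the two points above concrete, here is a minimal user-space
sketch of the two forms of the check. It is not kernel code: the struct
layouts, the "pinned" flag and both function names are hypothetical
stand-ins, and only zone_idx(), page_zone() and ZONE_MOVABLE borrow
their names from the patch. It assumes that some later check in the
loop would reject a pinned page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not the kernel's definitions. */
enum zone_type { ZONE_NORMAL, ZONE_MOVABLE };

struct zone {
	enum zone_type idx;
};

struct page {
	struct zone *zone;	/* zone this page belongs to */
	bool reserved;		/* models PageReserved() */
	bool pinned;		/* models a page that a later check rejects */
};

static enum zone_type zone_idx(const struct zone *z)
{
	return z->idx;
}

static struct zone *page_zone(const struct page *p)
{
	return p->zone;
}

/*
 * Variant A: the form posted in the patch. The test uses the zone
 * argument, so its result is the same on every loop iteration.
 */
bool has_unmovable_zone_check(const struct zone *zone,
			      const struct page *pages, size_t n)
{
	assert(zone != NULL && pages != NULL);

	for (size_t i = 0; i < n; i++) {
		if (pages[i].reserved)
			return true;
		if (zone_idx(zone) == ZONE_MOVABLE)
			continue;	/* the pinned check below is skipped */
		if (pages[i].pinned)
			return true;
	}
	return false;
}

/* Variant B: the per-page form asked about in the review. */
bool has_unmovable_page_check(const struct page *pages, size_t n)
{
	assert(pages != NULL);

	for (size_t i = 0; i < n; i++) {
		if (pages[i].reserved)
			return true;
		if (zone_idx(page_zone(&pages[i])) == ZONE_MOVABLE)
			continue;	/* skipped only for this page */
		if (pages[i].pinned)
			return true;
	}
	return false;
}
```

In this model a pinned page in a movable zone is reported as movable by
both variants, which is the "skips all the checks" concern. The two
variants differ only when the zone argument and the page's own zone
disagree; whether that can happen for the real caller is not something
this sketch settles.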