On Thu, Oct 23 2014, Joonsoo Kim wrote:
> There are two paths to reach the core free function of the buddy allocator,
> __free_one_page(): one is free_one_page()->__free_one_page() and the
> other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
> Each path has a race condition that causes serious problems. This
> patch focuses on the first type of freepath; the following patch
> will solve the problem in the second type of freepath.
>
> In the first type of freepath, we get the migratetype of the freeing page
> without holding the zone lock, so it could be racy. There are two cases of
> this race.
>
> 1. Pages are added to the isolate buddy list after the original
> migratetype has been restored.
>
> CPU1                                   CPU2
>
> get migratetype => return MIGRATE_ISOLATE
> call free_one_page() with MIGRATE_ISOLATE
>
>                                        grab the zone lock
>                                        unisolate pageblock
>                                        release the zone lock
>
> grab the zone lock
> call __free_one_page() with MIGRATE_ISOLATE
> freepage goes into isolate buddy list,
> although pageblock is already unisolated
>
> This may cause two problems. One is that we can't use this page anymore
> until the next isolation attempt on this pageblock, because the freepage is
> on the isolate buddy list. The other is that freepage accounting could be
> wrong due to merging between different buddy lists. Freepages on the isolate
> buddy list aren't counted as freepages, but ones on the normal buddy list
> are. If a merge happens, a buddy freepage on the normal buddy list is
> inevitably moved to the isolate buddy list without any adjustment of
> freepage accounting, so the count could become incorrect.
>
> 2. Pages are added to the normal buddy list while the pageblock is isolated.
> It is similar to the above case.
>
> This may also cause two problems. One is that we can't keep these
> freepages from being allocated. Although this pageblock is isolated,
> the freepage would be added to the normal buddy list, so it could be
> allocated without any restriction.
> The other problem is the same as in
> case 1, that is, incorrect freepage accounting.
>
> This race condition can be prevented by checking the migratetype again
> while holding the zone lock. Because that is a somewhat heavy operation
> and isn't needed in the common case, we want to avoid rechecking as much
> as possible. So this patch introduces a new variable, nr_isolate_pageblock,
> in struct zone to check whether there is any isolated pageblock.
> With this, we can avoid re-checking the migratetype in the common case and
> do it only if there is an isolated pageblock or the migratetype is
> MIGRATE_ISOLATE. This solves the above-mentioned problems.
>
> Changes from v3:
> Add one more check in free_one_page() that checks whether migratetype is
> MIGRATE_ISOLATE or not. Without this, the above-mentioned case 1 could
> happen.
>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>

Acked-by: Michal Nazarewicz <mina86@xxxxxxxxxx>

> ---
>  include/linux/mmzone.h         |    9 +++++++++
>  include/linux/page-isolation.h |    8 ++++++++
>  mm/page_alloc.c                |   11 +++++++++--
>  mm/page_isolation.c            |    2 ++
>  4 files changed, 28 insertions(+), 2 deletions(-)

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +--<mpn@xxxxxxxxxx>--<xmpp:mina86@xxxxxxxxxx>--ooO--(_)--Ooo--