On Tue, Aug 25, 2020 at 2:10 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, 25 Aug 2020 13:59:42 +0900 js1304@xxxxxxxxx wrote:
>
> > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >
> > memalloc_nocma_{save/restore} APIs can be used to skip page allocation
> > on CMA area, but, there is a missing case and the page on CMA area could
> > be allocated even if APIs are used. This patch handles this case to fix
> > the potential issue.
> >
> > Missing case is an allocation from the pcplist. MIGRATE_MOVABLE pcplist
> > could have the pages on CMA area so we need to skip it if ALLOC_CMA isn't
> > specified.
> >
> > This patch implements this behaviour by checking allocated page from
> > the pcplist rather than skipping an allocation from the pcplist entirely.
> > Skipping the pcplist entirely would result in a mismatch between watermark
> > check and actual page allocation. And, it requires to break current code
> > layering that order-0 page is always handled by the pcplist. I'd prefer
> > to avoid it so this patch uses different way to skip CMA page allocation
> > from the pcplist.
> >
> > ...
> >
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3341,6 +3341,22 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
> >  	pcp = &this_cpu_ptr(zone->pageset)->pcp;
> >  	list = &pcp->lists[migratetype];
> >  	page = __rmqueue_pcplist(zone, migratetype, alloc_flags, pcp, list);
> > +#ifdef CONFIG_CMA
> > +	if (page) {
> > +		int mt = get_pcppage_migratetype(page);
> > +
> > +		/*
> > +		 * pcp could have the pages on CMA area and we need to skip it
> > +		 * when !ALLOC_CMA. Free all pcplist and retry allocation.
> > +		 */
> > +		if (is_migrate_cma(mt) && !(alloc_flags & ALLOC_CMA)) {
> > +			list_add(&page->lru, &pcp->lists[migratetype]);
> > +			pcp->count++;
> > +			free_pcppages_bulk(zone, pcp->count, pcp);
> > +			page = __rmqueue_pcplist(zone, migratetype, alloc_flags, pcp, list);
> > +		}
> > +	}
> > +#endif
> >  	if (page) {
> >  		__count_zid_vm_events(PGALLOC, page_zonenum(page), 1);
> >  		zone_statistics(preferred_zone, zone);
>
> That's a bunch more code on a very hot path to serve an obscure feature
> which has a single obscure callsite.
>
> Can we instead put the burden on that callsite rather than upon
> everyone?  For (dumb) example, teach __gup_longterm_locked() to put the
> page back if it's CMA and go get another one?

Hmm... Unfortunately, that cannot guarantee that we eventually get a
non-CMA page. I think the only way to guarantee it is to implement the
functionality here. We can use 'unlikely' or a static branch to reduce
the overhead for this really rare case, but for now I have no idea how
to remove the overhead completely.

Thanks.
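
To make the 'unlikely' idea concrete, here is a rough userspace model of the check-and-retry flow with the branch hint applied. It is only a sketch under stated assumptions, not kernel code: every `model_*` / `MODEL_*` name is hypothetical, the arrays stand in for the pcplist and the buddy free lists, and `__builtin_expect` assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint in the style of the kernel macro (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins; the model assumes at most this many pages. */
#define MODEL_ALLOC_CMA 0x1u
#define MODEL_MAX_PAGES 8

struct model_page { int is_cma; };

struct model_list {
	struct model_page *pages[MODEL_MAX_PAGES];
	int count;
};

struct model_zone {
	struct model_list buddy;	/* models the zone free lists */
	struct model_list pcp;		/* models one per-cpu page list */
};

static void model_push(struct model_list *l, struct model_page *p)
{
	assert(l->count < MODEL_MAX_PAGES);
	l->pages[l->count++] = p;
}

static struct model_page *model_pop(struct model_list *l)
{
	return l->count ? l->pages[--l->count] : NULL;
}

/* Refill the pcp from buddy; CMA pages move only if the flags allow it. */
static void model_refill(struct model_zone *z, unsigned int alloc_flags)
{
	int i, kept = 0;

	for (i = 0; i < z->buddy.count; i++) {
		struct model_page *p = z->buddy.pages[i];

		if (p->is_cma && !(alloc_flags & MODEL_ALLOC_CMA))
			z->buddy.pages[kept++] = p;	/* stays in buddy */
		else
			model_push(&z->pcp, p);
	}
	z->buddy.count = kept;
}

static struct model_page *model_rmqueue_pcp(struct model_zone *z,
					    unsigned int alloc_flags)
{
	if (!z->pcp.count)
		model_refill(z, alloc_flags);
	return model_pop(&z->pcp);
}

/* Allocate from the pcp; reject a CMA page when the caller forbids CMA. */
static struct model_page *model_alloc(struct model_zone *z,
				      unsigned int alloc_flags)
{
	struct model_page *page = model_rmqueue_pcp(z, alloc_flags);

	/* Rare case: hint the compiler so the common path stays straight. */
	if (unlikely(page && page->is_cma &&
		     !(alloc_flags & MODEL_ALLOC_CMA))) {
		/* Put the page back, drain the whole pcp, then retry. */
		model_push(&z->pcp, page);
		while ((page = model_pop(&z->pcp)))
			model_push(&z->buddy, page);
		page = model_rmqueue_pcp(z, alloc_flags);
	}
	return page;
}
```

In this model the common path pays for one extra test on the returned page, and the drain-and-retry work only runs in the rare case. A static branch would go one step further by patching the test out while no nocma user exists, which is not modelled here.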