On 14.08.19 01:14, Alexander Duyck wrote: > On Tue, Aug 13, 2019 at 3:34 AM David Hildenbrand <david@xxxxxxxxxx> wrote: >> >>>>>> +static int process_free_page(struct page *page, >>>>>> + struct page_reporting_config *phconf, int count) >>>>>> +{ >>>>>> + int mt, order, ret = 0; >>>>>> + >>>>>> + mt = get_pageblock_migratetype(page); >>>>>> + order = page_private(page); >>>>>> + ret = __isolate_free_page(page, order); >>>>>> + >>>> I just started looking into the wonderful world of >>>> isolation/compaction/migration. >>>> >>>> I don't think saving/restoring the migratetype is correct here. AFAIK, >>>> MOVABLE/UNMOVABLE/RECLAIMABLE is just a hint, doesn't mean that e.g., >>>> movable pages and up in UNMOVABLE or ordinary kernel allocations on >>>> MOVABLE. So that shouldn't be an issue - I guess. >>>> >>>> 1. You should never allocate something that is no >>>> MOVABLE/UNMOVABLE/RECLAIMABLE. Especially not, if you have ISOLATE or >>>> CMA here. There should at least be a !is_migrate_isolate_page() check >>>> somewhere >>>> >>>> 2. set_migratetype_isolate() takes the zone lock, so to avoid racing >>>> with isolation code, you have to hold the zone lock. Your code seems to >>>> do that, so at least you cannot race against isolation. >>>> >>>> 3. You could end up temporarily allocating something in the >>>> ZONE_MOVABLE. The pages you allocate are, however, not movable. There >>>> would have to be a way to make alloc_contig_range()/offlining code >>>> properly wait until the pages have been processed. Not sure about the >>>> real implications, though - too many details in the code (I wonder if >>>> Alex' series has a way of dealing with that) >>>> >>>> When you restore the migratetype, you could suddenly overwrite e.g., >>>> ISOLATE, which feels wrong. >>> >>> >>> I was triggering an occasional CPU stall bug earlier, with saving and restoring >>> the migratetype I was able to fix it. >>> But I will further look into this to figure out if it is really required. >>> >> >> You should especially look into handling isolated/cma pages. Maybe that >> was the original issue. Alex seems to have added that in his latest >> series (skipping isolated/cma pageblocks completely) as well. > > So as far as skipping isolated pageblocks, I get the reason for > skipping isolated, but why would we need to skip CMA? I had made the > change I did based on comments you had made earlier. But while working > on some of the changes to address isolation better and looking over > several spots in the code it seems like CMA is already being used as > an allocation fallback for MIGRATE_MOVABLE. If that is the case > wouldn't it make sense to allow pulling pages and reporting them while > they are in the free_list? I was assuming that CMA is also to be skipped because "static int fallbacks[MIGRATE_TYPES][4]" in mm/page_alloc.c doesn't handle CMA at all, meaning we should never fallback to CMA or from CMA to another type - at least when stealing pages from another migratetype. So it smells like MIGRATE_CMA is static -> the area is marked once and will never be converted to something else (except MIGRATE_ISOLATE temporarily). I assume you are talking about gfp_to_alloc_flags()/prepare_alloc_pages(): #ifdef CONFIG_CMA if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE) alloc_flags |= ALLOC_CMA; #endif Yeah, this looks like MOVABLE allocations can fallback to CMA pageblocks. And from what I read, "CMA may use its own migratetype (MIGRATE_CMA) which behaves similarly to ZONE_MOVABLE but can be put in arbitrary places." So I think you are right, it could be that it is safe to temporarily pull out CMA pages (in contrast to isolated pages) - assuming it is fine to have temporary unmovable allocations on them (different discussion). (I am learning about the details as we discuss :) ) The important part would then be to never allocate from the isolated pageblocks and to never overwrite MIGRATE_ISOLATE. -- Thanks, David / dhildenb