>>>> +static int process_free_page(struct page *page, >>>> + struct page_reporting_config *phconf, int count) >>>> +{ >>>> + int mt, order, ret = 0; >>>> + >>>> + mt = get_pageblock_migratetype(page); >>>> + order = page_private(page); >>>> + ret = __isolate_free_page(page, order); >>>> + >> I just started looking into the wonderful world of >> isolation/compaction/migration. >> >> I don't think saving/restoring the migratetype is correct here. AFAIK, >> MOVABLE/UNMOVABLE/RECLAIMABLE is just a hint, doesn't mean that e.g., >> movable pages and up in UNMOVABLE or ordinary kernel allocations on >> MOVABLE. So that shouldn't be an issue - I guess. >> >> 1. You should never allocate something that is no >> MOVABLE/UNMOVABLE/RECLAIMABLE. Especially not, if you have ISOLATE or >> CMA here. There should at least be a !is_migrate_isolate_page() check >> somewhere >> >> 2. set_migratetype_isolate() takes the zone lock, so to avoid racing >> with isolation code, you have to hold the zone lock. Your code seems to >> do that, so at least you cannot race against isolation. >> >> 3. You could end up temporarily allocating something in the >> ZONE_MOVABLE. The pages you allocate are, however, not movable. There >> would have to be a way to make alloc_contig_range()/offlining code >> properly wait until the pages have been processed. Not sure about the >> real implications, though - too many details in the code (I wonder if >> Alex' series has a way of dealing with that) >> >> When you restore the migratetype, you could suddenly overwrite e.g., >> ISOLATE, which feels wrong. > > > I was triggering an occasional CPU stall bug earlier, with saving and restoring > the migratetype I was able to fix it. > But I will further look into this to figure out if it is really required. > You should especially look into handling isolated/cma pages. Maybe that was the original issue. Alex seems to have added that in his latest series (skipping isolated/cma pageblocks completely) as well. >> [...] >>> So as per your comments in the cover page, the two functions above >>> should also probably be plugged into the zone resizing logic somewhere >>> so if a zone is resized the bitmap is adjusted. >>> >>>> +/** >>>> + * zone_reporting_init - For each zone initializes the page reporting fields >>>> + * and allocates the respective bitmap. >>>> + * >>>> + * This function returns 0 on successful initialization, -ENOMEM otherwise. >>>> + */ >>>> +static int zone_reporting_init(void) >>>> +{ >>>> + struct zone *zone; >>>> + int ret; >>>> + >>>> + for_each_populated_zone(zone) { >>>> +#ifdef CONFIG_ZONE_DEVICE >>>> + /* we can not report pages which are not in the buddy */ >>>> + if (zone_idx(zone) == ZONE_DEVICE) >>>> + continue; >>>> +#endif >>> I'm pretty sure this isn't needed since I don't think the ZONE_DEVICE >>> zone will be considered "populated". >>> >> I think you are right (although it's confusing, we will have present >> sections part of a zone but the zone has no present_pages - screams like >> a re factoring - leftover from ZONE_DEVICE introduction). > > > I think in that case it is safe to have this check here. > What do you guys suggest? If it's not needed, I'd say drop it (eventually add a comment). -- Thanks, David / dhildenb