On Fri, Apr 21, 2023 at 11:06:48AM -0400, Johannes Weiner wrote:
> On Fri, Apr 21, 2023 at 01:47:44PM +0100, Mel Gorman wrote:
> > On Tue, Apr 18, 2023 at 03:12:52PM -0400, Johannes Weiner wrote:
> > > Right now, there is only one pcplist for THP allocations. However,
> > > while most THPs are movable, the huge zero page is not. This means a
> > > movable THP allocation can grab an unmovable block from the pcplist,
> > > and a subsequent THP split, partial free, and reallocation of the
> > > remainder will mix movable and unmovable pages in the block.
> > >
> > > While this isn't a huge source of block pollution in practice, it
> > > happens often enough to trigger debug warnings fairly quickly under
> > > load. In the interest of tightening up pageblock hygiene, make the THP
> > > pcplists fully migratetype-aware, just like the lower order ones.
> > >
> > > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> >
> > Split out :P
> >
> > Take special care of this one because, while I didn't check this, I
> > suspect it'll push the PCP structure size into the next cache line and
> > increase overhead.
> >
> > The changelog makes it unclear why exactly this happens or why the
> > patch fixes it.
>
> Before this, I'd see warnings from the last patch in the series about
> received migratetype not matching requested mt.
>
> The way it happens is that the zero page gets freed and the unmovable
> block put on the pcplist. A regular THP allocation is subsequently
> served from an unmovable block.
>
> Mental note, I think this can happen the other way around too: a
> regular THP on the pcp being served to a MIGRATE_UNMOVABLE zero
> THP. It's not supposed to, but it looks like there is a bug in the
> code that's meant to prevent that from happening in rmqueue():
>
> 	if (likely(pcp_allowed_order(order))) {
> 		/*
> 		 * MIGRATE_MOVABLE pcplist could have the pages on CMA area and
> 		 * we need to skip it when CMA area isn't allowed.
> 		 */
> 		if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> 				migratetype != MIGRATE_MOVABLE) {
> 			page = rmqueue_pcplist(preferred_zone, zone, order,
> 					migratetype, alloc_flags);
> 			if (likely(page))
> 				goto out;
> 		}
> 	}
>
> Surely that last condition should be migratetype == MIGRATE_MOVABLE?
>

It should be. It would have been missed for ages because it would need a
test case based on a machine configuration that requires CMA for
functional correctness and is using THP which is an unlikely combination.

> > The huge zero page strips GFP_MOVABLE (so unmovable)
> > but at allocation time, it doesn't really matter what the movable type
> > is because it's a full pageblock. It doesn't appear to be a hazard until
> > the split happens. Assuming that's the case, it should be ok to always
> > set the pageblock movable for THP allocations regardless of GFP flags at
> > allocation time or else set the pageblock MOVABLE at THP split (always
> > MOVABLE at allocation time makes more sense).
>
> The regular allocator compaction skips over compound pages anyway, so
> the migratetype should indeed not matter there.
>
> The bigger issue is CMA. alloc_contig_range() will try to move THPs to
> free a larger range. We have to be careful not to place an unmovable
> zero THP into a CMA region. That means we can not play games with MT -
> we really do have to physically keep unmovable and movable THPs apart.
>

Fair point.

> Another option would be not to use pcp for the zero THP. It's cached
> anyway in the caller. But it would add branches to the THP alloc and
> free fast paths (pcp_allowed_order() also checking migratetype).

And this is probably the most straight-forward option. The intent behind
caching some THPs on PCP was faulting large mappings of normal THPs and
reducing the contention on the zone lock a little. The zero THP is
somewhat special because it should not be allocated at high frequency.

-- 
Mel Gorman
SUSE Labs