On 9/11/23 17:57, Johannes Weiner wrote:
> On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
>> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
>>
>> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> returns pages to the buddy list on PCP lock contention. However, for
>> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
>> been clobbered already for pages that are not being isolated. In
>> practice, this means that CMA pages may be returned to the wrong
>> buddy list. While this might be harmless in some cases as it is
>> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
>> and prevent a future CMA allocation. Lookup the PCP migratetype
>> against unconditionally if the PCP lock is contended.
>>
>> [lecopzer.chen@xxxxxxxxxxxx: CMA-specific fix]
>> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
>> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> ---
>>  mm/page_alloc.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 452459836b71..4053c377fee8 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>>                  free_unref_page_commit(zone, pcp, page, migratetype, order);
>>                  pcp_spin_unlock(pcp);
>>          } else {
>> -                free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
>> +                /*
>> +                 * The page migratetype may have been clobbered for types
>> +                 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
>> +                 * must be rechecked.
>> +                 */
>> +                free_one_page(zone, page, pfn, order,
>> +                              get_pcppage_migratetype(page), FPI_NONE);
>>          }
>>          pcp_trylock_finish(UP_flags);
>>  }
>>
>
> I had sent a (similar) fix for this here:
>
> https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@xxxxxxxxxxx/
>
> The context wasn't CMA, but HIGHATOMIC pages going to the movable
> freelist. But the class of bug is the same: the migratetype tweaking
> really only applies to the pcplist, not the buddy slowpath; I added a
> local pcpmigratetype to make it more clear, and hopefully prevent bugs
> of this nature down the line.

Seems to be the cleanest solution to me, indeed.

> I'm just preparing v2 of the above series. Do you want me to break
> this change out and send it separately?

Works for me, if you combine it with the information about which commit
it fixes, the CMA implications reported, and Cc stable.
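
For illustration, the distinction discussed above (remap the migratetype
only to pick a pcplist, and keep the page's real migratetype for the buddy
slowpath) can be modeled in a few lines of userspace C. This is a toy
sketch, not kernel code: struct free_target, model_free_page() and the
pcp_lock_ok flag are hypothetical names made up for this example, and the
enum is a simplified copy of the kernel's migratetype ordering with the
config #ifdefs dropped. Only the "local pcpmigratetype" idea is taken from
the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified copy of the kernel's migratetype ordering (assumption: no #ifdefs). */
enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_PCPTYPES,			/* number of types that have a pcplist */
	MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,
	MIGRATE_CMA,
	MIGRATE_ISOLATE,
	MIGRATE_TYPES
};

/* Hypothetical result type: where the freed page ended up. */
struct free_target {
	bool on_pcplist;		/* true: pcplist, false: buddy freelist */
	enum migratetype list;		/* migratetype index of that list */
};

/*
 * Hypothetical model of the free path. pcp_lock_ok stands in for the
 * outcome of the PCP trylock; false means the lock was contended.
 */
static struct free_target model_free_page(enum migratetype migratetype,
					  bool pcp_lock_ok)
{
	/* Separate local: the remapping below must never leak to the buddy path. */
	enum migratetype pcpmigratetype = migratetype;

	assert(migratetype < MIGRATE_TYPES);

	/* Isolated pages skip the pcplists and go straight to the buddy lists. */
	if (migratetype == MIGRATE_ISOLATE)
		return (struct free_target){ false, migratetype };

	/* Types without their own pcplist share the movable pcplist. */
	if (migratetype >= MIGRATE_PCPTYPES)
		pcpmigratetype = MIGRATE_MOVABLE;

	if (pcp_lock_ok)
		return (struct free_target){ true, pcpmigratetype };

	/* Contended: fall back to the buddy list of the page's real type. */
	return (struct free_target){ false, migratetype };
}
```

In this model a CMA or HIGHATOMIC page lands on the movable pcplist when
the lock is taken, but on its own buddy list when the lock is contended,
which is the behaviour the fix is after. Reusing a single variable for both
purposes would send the contended case to the movable buddy list instead.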