On Tue, Jul 02, 2024 at 12:00:19PM +0800, yangge1116@xxxxxxx wrote:
> From: yangge <yangge1116@xxxxxxx>
>
> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
> THP-sized allocations") no longer differentiates the migration type of
> pages in THP-sized PCP list, it's possible that non-movable allocation
> requests may get a CMA page from the list, in some cases, it's not
> acceptable.
>
> If a large number of CMA memory are configured in system (for example, the
> CMA memory accounts for 50% of the system memory), starting a virtual
> machine with device passthrough will get stuck. During starting the
> virtual machine, it will call pin_user_pages_remote(..., FOLL_LONGTERM,
> ...) to pin memory. Normally if a page is present and in CMA area,
> pin_user_pages_remote() will migrate the page from CMA area to non-CMA
> area because of FOLL_LONGTERM flag. But if non-movable allocation
> requests return CMA memory, migrate_longterm_unpinnable_pages() will
> migrate a CMA page to another CMA page, which will fail to pass the check
> in check_and_migrate_movable_pages() and cause migration endless.
>
> Call trace:
> pin_user_pages_remote
> --__gup_longterm_locked // endless loops in this function
> ----_get_user_pages_locked
> ----check_and_migrate_movable_pages
> ------migrate_longterm_unpinnable_pages
> --------alloc_migration_target
>
> This problem will also have a negative impact on CMA itself. For example,
> when CMA is borrowed by THP, and we need to reclaim it through cma_alloc()
> or dma_alloc_coherent(), we must move those pages out to ensure CMA's
> users can retrieve that contiguous memory. Currently, CMA's memory is
> occupied by non-movable pages, meaning we can't relocate them. As a
> result, cma_alloc() is more likely to fail.
>
> To fix the problem above, we add one PCP list for THP, which will not
> introduce a new cacheline for struct per_cpu_pages. THP will have 2 PCP
> lists, one PCP list is used by MOVABLE allocation, and the other PCP list
> is used by UNMOVABLE allocation. MOVABLE allocation contains GFP_MOVABLE,
> and UNMOVABLE allocation contains GFP_UNMOVABLE and GFP_RECLAIMABLE.
>
> Link: https://lkml.kernel.org/r/1718845190-4456-1-git-send-email-yangge1116@xxxxxxx
> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
> Signed-off-by: yangge <yangge1116@xxxxxxx>
> Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
> Cc: Barry Song <21cnbao@xxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> (cherry picked from commit bf14ed81f571f8dba31cd72ab2e50fbcc877cc31)
> Signed-off-by: yangge <yangge1116@xxxxxxx>
> ---
>  include/linux/mmzone.h | 9 ++++-----
>  mm/page_alloc.c        | 8 ++++++--
>  2 files changed, 10 insertions(+), 7 deletions(-)
>

All now queued up, thanks.

greg k-h