On Wed, 3 Jul 2024 20:02:33 +0800 yangge1116@xxxxxxx wrote:

> From: yangge <yangge1116@xxxxxxx>
>
> If a large amount of CMA memory is configured in the system (for example,
> CMA memory accounts for 50% of system memory), starting a virtual machine
> with device passthrough will call
> pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory.
> Normally, if a page is present and in a CMA area, pin_user_pages_remote()
> will migrate the page from the CMA area to a non-CMA area because of the
> FOLL_LONGTERM flag. But the current code causes the migration to fail
> due to unexpected page refcounts, and eventually causes the virtual
> machine to fail to start.
>
> Adding a page to an LRU batch increases its refcount by one, and removing
> the page from the LRU batch decreases it by one. Page migration requires
> that the page is not referenced by anything other than its page mappings.
> Before migrating a page, we should try to drain it from the LRU batch in
> case it is in one. However, folio_test_lru() is not sufficient to tell
> whether the page is in an LRU batch, and if the page is in an LRU batch,
> the migration will fail.
>
> To solve the problem above, we modify the logic of adding to an LRU batch.
> Before adding a page to an LRU batch, we clear the LRU flag of the page so
> that we can check whether the page is in an LRU batch with
> folio_test_lru(page). Keeping the LRU flag of the page invisible for a
> long time seems to be no problem, because when a new page is allocated
> from buddy and added to the LRU batch, its LRU flag is also not visible
> for a long time.
>

Thanks. I'll add this to the mm-hotfixes branch for additional testing.
Please continue to work with David on the changelog enhancements.

In mm-hotfixes I'd expect to send it to Linus next week. I could move it
into mm-unstable (then mm-stable) for merging into 6.11-rc1. This is for
additional testing time - it will still be backported into earlier
kernels. We can do this with any patch.
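
For readers following the refcount argument in the quoted changelog, here
is a rough user-space sketch of it. This is not kernel code: struct
toy_folio, struct toy_batch and every toy_* function are hypothetical names
invented for illustration, and "refcount == mapcount" is a deliberately
simplified stand-in for the rule that only page mappings may reference a
page being migrated. The clear_lru_first parameter is also an assumption of
the sketch; it switches between the current behaviour and the proposed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BATCH_SIZE 15

/* Hypothetical stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	int refcount;	/* all references held on the page */
	int mapcount;	/* references that come from page mappings */
	bool lru;	/* models the LRU flag seen by folio_test_lru() */
};

/* Hypothetical stand-in for a per-CPU LRU batch. */
struct toy_batch {
	struct toy_folio *folios[TOY_BATCH_SIZE];
	size_t nr;
};

/*
 * Adding a folio to the batch takes one extra reference.
 * With clear_lru_first set, the LRU flag is cleared before the add,
 * which is the change the changelog proposes.
 */
static bool toy_batch_add(struct toy_batch *b, struct toy_folio *f,
			  bool clear_lru_first)
{
	if (b->nr == TOY_BATCH_SIZE)
		return false;
	if (clear_lru_first)
		f->lru = false;
	f->refcount++;
	b->folios[b->nr++] = f;
	return true;
}

/* Draining drops the batch reference and makes the folio visible on the LRU. */
static void toy_batch_drain(struct toy_batch *b)
{
	for (size_t i = 0; i < b->nr; i++) {
		struct toy_folio *f = b->folios[i];

		assert(f->refcount > 0);
		f->refcount--;
		f->lru = true;
	}
	b->nr = 0;
}

/* Simplified migration rule: only the mappings may hold references. */
static bool toy_can_migrate(const struct toy_folio *f)
{
	return f->refcount == f->mapcount;
}

/* The caller drains only when the LRU flag says the folio may be batched. */
static bool toy_try_migrate(struct toy_batch *b, struct toy_folio *f)
{
	if (!f->lru)
		toy_batch_drain(b);
	return toy_can_migrate(f);
}

/* One mapped folio, already on the LRU, gets batched and then migrated. */
static bool toy_scenario(bool clear_lru_first)
{
	struct toy_batch batch = { .nr = 0 };
	struct toy_folio folio = { .refcount = 1, .mapcount = 1, .lru = true };

	toy_batch_add(&batch, &folio, clear_lru_first);
	return toy_try_migrate(&batch, &folio);
}
```

In this model toy_scenario(false) fails to migrate: the folio still looks
like it is on the LRU, so no drain happens and the batch reference remains.
toy_scenario(true) succeeds, because the cleared flag triggers the drain
before the refcount check.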