On Thu, Mar 06, 2025 at 03:42:28PM +1100, Balbir Singh wrote:

This is an exciting series to see. As of today, we have just merged this
series into the DRM subsystem / Xe [2], which adds very basic SVM
support. One of the performance bottlenecks we quickly identified was
the lack of THP for device pages. I believe our profiling showed that 96%
of the time spent on 2M page GPU faults was within the migrate_vma_*
functions. Presumably, this will help significantly.

We will likely attempt to pull this code into GPU SVM / Xe fairly soon.
I believe we will encounter a conflict since [2] includes these patches
[3] [4], but we should be able to resolve that. These patches might make
it into the 6.15 PR. That is TBD, but I can get back to you on it.

I have one question: does this series contain all the required core MM
changes for us to give it a try? That is, do I need to include any other
code from the list to test this out?

Matt

[2] https://patchwork.freedesktop.org/series/137870/
[3] https://patchwork.freedesktop.org/patch/641207/?series=137870&rev=8
[4] https://patchwork.freedesktop.org/patch/641214/?series=137870&rev=8

> This patch series adds support for THP migration of zone device pages.
> To do so, the patches implement support for folio zone device pages
> by adding support for setting up larger order pages.
>
> These patches build on the earlier posts by Ralph Campbell [1]
>
> Two new flags are added in vma_migration to select and mark compound pages.
> migrate_vma_setup(), migrate_vma_pages() and migrate_vma_finalize()
> support migration of these pages when MIGRATE_VMA_SELECT_COMPOUND
> is passed in as arguments.
>
> The series also adds zone device awareness to (m)THP pages along
> with fault handling of large zone device private pages. page vma walk
> and the rmap code is also zone device aware.
> Support has also been
> added for folios that might need to be split in the middle
> of migration (when the src and dst do not agree on
> MIGRATE_PFN_COMPOUND), that occurs when src side of the migration can
> migrate large pages, but the destination has not been able to allocate
> large pages. The code supported and used folio_split() when migrating
> THP pages, this is used when MIGRATE_VMA_SELECT_COMPOUND is not passed
> as an argument to migrate_vma_setup().
>
> The test infrastructure lib/test_hmm.c has been enhanced to support THP
> migration. A new ioctl to emulate failure of large page allocations has
> been added to test the folio split code path. hmm-tests.c has new test
> cases for huge page migration and to test the folio split path.
>
> The nouveau dmem code has been enhanced to use the new THP migration
> capability.
>
> mTHP support:
>
> The patches hard code, HPAGE_PMD_NR in a few places, but the code has
> been kept generic to support various order sizes. With additional
> refactoring of the code support of different order sizes should be
> possible.
>
> References:
> [1] https://lore.kernel.org/linux-mm/20201106005147.20113-1-rcampbell@xxxxxxxxxx/
>
> These patches are built on top of mm-everything-2025-03-04-05-51
>
> Cc: Karol Herbst <kherbst@xxxxxxxxxx>
> Cc: Lyude Paul <lyude@xxxxxxxxxx>
> Cc: Danilo Krummrich <dakr@xxxxxxxxxx>
> Cc: David Airlie <airlied@xxxxxxxxx>
> Cc: Simona Vetter <simona@xxxxxxxx>
> Cc: "Jérôme Glisse" <jglisse@xxxxxxxxxx>
> Cc: Shuah Khan <shuah@xxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Barry Song <baohua@xxxxxxxxxx>
> Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Peter Xu <peterx@xxxxxxxxxx>
> Cc: Zi Yan <ziy@xxxxxxxxxx>
> Cc: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> Cc: Jane Chu <jane.chu@xxxxxxxxxx>
> Cc: Alistair Popple <apopple@xxxxxxxxxx>
> Cc: Donet Tom <donettom@xxxxxxxxxxxxx>
>
> Balbir Singh (11):
>   mm/zone_device: support large zone device private folios
>   mm/migrate_device: flags for selecting device private THP pages
>   mm/thp: zone_device awareness in THP handling code
>   mm/migrate_device: THP migration of zone device pages
>   mm/memory/fault: Add support for zone device THP fault handling
>   lib/test_hmm: test cases and support for zone device private THP
>   mm/memremap: Add folio_split support
>   mm/thp: add split during migration support
>   lib/test_hmm: add test case for split pages
>   selftests/mm/hmm-tests: new tests for zone device THP migration
>   gpu/drm/nouveau: Add THP migration support
>
>  drivers/gpu/drm/nouveau/nouveau_dmem.c | 244 +++++++++----
>  drivers/gpu/drm/nouveau/nouveau_svm.c  |   6 +-
>  drivers/gpu/drm/nouveau/nouveau_svm.h  |   3 +-
>  include/linux/huge_mm.h                |  18 +-
>  include/linux/memremap.h               |  29 +-
>  include/linux/migrate.h                |   2 +
>  include/linux/mm.h                     |   1 +
>  lib/test_hmm.c                         | 387 ++++++++++++++++----
>  lib/test_hmm_uapi.h                    |   3 +
>  mm/huge_memory.c                       | 242 +++++++++---
>  mm/memory.c                            |   6 +-
>  mm/memremap.c                          |  50 ++-
>  mm/migrate.c                           |   2 +
>  mm/migrate_device.c                    | 488 +++++++++++++++++++++----
>  mm/page_alloc.c                        |   1 +
>  mm/page_vma_mapped.c                   |  10 +
>  mm/rmap.c                              |  19 +-
>  tools/testing/selftests/mm/hmm-tests.c | 407 +++++++++++++++++++++
>  18 files changed, 1630 insertions(+), 288 deletions(-)
>
> --
> 2.48.1
>
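For anyone following along, here is a rough userspace sketch of how I read the split rule in the cover letter: a folio is split mid-migration when the source entry is marked compound but the destination entry is not. This is only a model of that description, not code from the series. The SKETCH_* names, the bit values, and both helper functions are made up for illustration (the real flags are defined in include/linux/migrate.h), and the handling of a small source paired with a large destination is my own assumption, since the cover letter does not describe that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define SKETCH_PFN_VALID    (1UL << 0)
#define SKETCH_PFN_MIGRATE  (1UL << 1)
#define SKETCH_PFN_COMPOUND (1UL << 2)

enum sketch_action {
	SKETCH_SKIP,               /* entry is not migrated at all */
	SKETCH_MIGRATE_LARGE,      /* src and dst both large: move as one folio */
	SKETCH_SPLIT_THEN_MIGRATE, /* src large, dst small: split first */
	SKETCH_MIGRATE_SMALL       /* ordinary single-page migration */
};

/* Decide what to do with one (src, dst) entry pair. */
static enum sketch_action sketch_classify(unsigned long src, unsigned long dst)
{
	bool src_large = (src & SKETCH_PFN_COMPOUND) != 0;
	bool dst_large = (dst & SKETCH_PFN_COMPOUND) != 0;

	/* Nothing to do unless the source is migratable and a destination exists. */
	if (!(src & SKETCH_PFN_MIGRATE) || !(dst & SKETCH_PFN_VALID))
		return SKETCH_SKIP;

	if (src_large && dst_large)
		return SKETCH_MIGRATE_LARGE;

	/* The disagreement case from the cover letter: large source, small destination. */
	if (src_large)
		return SKETCH_SPLIT_THEN_MIGRATE;

	/* Assumption: a small source is migrated as a single page regardless of dst. */
	return SKETCH_MIGRATE_SMALL;
}

/* Count how many entries in a batch would need a split under this model. */
static size_t sketch_count_splits(const unsigned long *src,
				  const unsigned long *dst, size_t n)
{
	size_t i, splits = 0;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < n; i++)
		if (sketch_classify(src[i], dst[i]) == SKETCH_SPLIT_THEN_MIGRATE)
			splits++;
	return splits;
}
```

In this model, a driver whose large destination allocation fails would simply leave the compound bit clear on the destination entry, and that entry would then be classified as needing a split.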