v1:
AMD is building a system architecture for the Frontier supercomputer
with a coherent interconnect between CPUs and GPUs. This hardware
architecture allows the CPUs to coherently access GPU device memory.
We have hardware in our labs and we are working with our partner HPE on
the BIOS, firmware and software for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu
driver registers the memory with devmap as MEMORY_DEVICE_PUBLIC using
devm_memremap_pages.

This patch series adds MEMORY_DEVICE_PUBLIC, which is similar to
MEMORY_DEVICE_GENERIC in that it can be mapped for CPU access, but adds
support for migrating this memory similar to MEMORY_DEVICE_PRIVATE. We
also included and updated two patches from Ralph Campbell (Nvidia),
which change ZONE_DEVICE reference counting as requested in previous
reviews of this patch series (see
https://patchwork.freedesktop.org/series/90706/). Finally, we extended
hmm_test to cover migration of MEMORY_DEVICE_PUBLIC.

This work is based on HMM and our SVM memory manager, which has landed
in Linux 5.14 recently.

v2:
Major changes on this version:
Fold patches: 'mm: call pgmap->ops->page_free for DEVICE_PUBLIC' and
'mm: add public type support to migrate_vma helpers' into 'mm: add zone
device public type memory support'

Condition added at migrate_vma_collect_pmd, for migrations from
device public pages. Making sure pages are from device zone and with
the proper MIGRATE_VMA_SELECT_DEVICE_PUBLIC flag.
Patch: 'mm: add device public vma selection for memory migration'

Fix logic in 'drm/amdkfd: add SPM support for SVM' to detect error in
both DEVICE_PRIVATE and DEVICE_PUBLIC.

Minor changes:
Swap patch order 03 and 04.

Additions:
Add VM_BUG_ON_PAGE(page_ref_count(page), page) to patch 'drm/amdkfd:
ref count init for device pages', to make sure page hasn't been used

Alex Sierra (10):
  mm: add zone device public type memory support
  mm: add device public vma selection for memory migration
  drm/amdkfd: ref count init for device pages
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: public type as sys mem on migration to ram
  lib: test_hmm add ioctl to get zone device type
  lib: test_hmm add module param for zone device type
  lib: add support for device public type in test_hmm
  tools: update hmm-test to support device public type
  tools: update test_hmm script to support SP config

Ralph Campbell (2):
  ext4/xfs: add page refcount helper
  mm: remove extra ZONE_DEVICE struct page refcount

 arch/powerpc/kvm/book3s_hv_uvmem.c       |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  40 ++--
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |   2 +-
 fs/dax.c                                 |   8 +-
 fs/ext4/inode.c                          |   5 +-
 fs/fuse/dax.c                            |   4 +-
 fs/xfs/xfs_file.c                        |   4 +-
 include/linux/dax.h                      |  10 +
 include/linux/memremap.h                 |  15 +-
 include/linux/migrate.h                  |   1 +
 include/linux/mm.h                       |  19 +-
 lib/test_hmm.c                           | 247 +++++++++++++++--------
 lib/test_hmm_uapi.h                      |  16 ++
 mm/internal.h                            |   8 +
 mm/memcontrol.c                          |   6 +-
 mm/memory-failure.c                      |   6 +-
 mm/memremap.c                            |  71 ++-----
 mm/migrate.c                             |  33 +--
 mm/page_alloc.c                          |   3 +
 mm/swap.c                                |  45 +----
 tools/testing/selftests/vm/hmm-tests.c   | 142 +++++++++++--
 tools/testing/selftests/vm/test_hmm.sh   |  20 +-
 22 files changed, 451 insertions(+), 256 deletions(-)

--
2.32.0