This patch includes Ralph Campbell’s ZONE_DEVICE refcount cleanup, plus the changes necessary to avoid breaking the separately submitted MEMORY_DEVICE_COHERENT page migration code.

Ralph’s original description:

ZONE_DEVICE struct pages have an extra reference count that complicates the code for put_page() and several places in the kernel that need to check the reference count to see that a page is not being used (gup, compaction, migration, etc.). Clean up the code so the reference count doesn't need to be treated specially for ZONE_DEVICE.

Following a suggestion by Christoph, we attempted to combine this cleanup with the device page migration patch series. However, this caused xfstests 413 to fail with a warning, the root cause of which has a larger kernel footprint than just device memory. Fixing this issue properly will require cooperation between multiple development groups working across multiple kernel subsystems, as is apparent from the discussion under the earlier, combined patch submission. We therefore propose to break this work out separately as its own patch, so it can receive the cooperative development work it needs.

The deep problem arises from the get_user_pages API, which has proved troublesome for many years. It is possible that a concerted effort to repair this particular refcount issue properly will be a step in the direction of rationalizing this popular and problematic API. In the larger picture, this API rationalization work probably deserves an agenda item at the upcoming Filesystem, MM & BPF Summit:

https://events.linuxfoundation.org/lsfmm/

The wide-ranging discussion following previous iterations of the migration patchset focused almost exclusively on the refcount cleanup patch. The thread is here:

https://lore.kernel.org/linux-mm/20211014153928.16805-3-alex.sierra@xxxxxxx/

and links a number of previous threads.

It is apparent that there is a lot of work in progress by a number of developer groups in parallel, and that there are issues with the order in which this work should be attempted and merged. Jason provided his list of “balls in the air”:

 - Joao's compound page support for device_dax and more
 - Alex's DEVICE_COHERENT
 - The refcount normalization
 - Removing the pgmap test from GUP
 - Removing the need for the PUD/PMD/PTE special bit
 - Removing the need for the PUD/PMD/PTE devmap bit
 - Remove PUD/PMD vma_is_special
 - folios for fsdax
 - shootdown for fsdax

It is not clear that the refcount cleanup in this patch should be the first item on the list to be merged, however it has proved to be a good starting point for a cooperative effort to address the underlying issues.

Ralph, if you would prefer to take back “ownership” of this patch, it’s yours, otherwise we will be happy to keep it in play and get it merged when some other pieces of the puzzle fall into place.

Ralph Campbell (2):
  ext4/xfs: add page refcount helper
  mm: remove extra ZONE_DEVICE struct page refcount

 arch/powerpc/kvm/book3s_hv_uvmem.c       |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  3 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  2 +-
 fs/dax.c                                 |  8 +--
 fs/ext4/inode.c                          |  5 +-
 fs/fuse/dax.c                            |  4 +-
 fs/xfs/xfs_file.c                        |  4 +-
 include/linux/dax.h                      | 10 ++++
 include/linux/memremap.h                 |  7 ++-
 include/linux/mm.h                       | 11 ----
 lib/test_hmm.c                           |  2 +-
 mm/internal.h                            |  7 +++
 mm/memcontrol.c                          |  6 +-
 mm/memremap.c                            | 72 +++++++-----------------
 mm/migrate.c                             |  5 --
 mm/page_alloc.c                          |  3 +
 mm/swap.c                                | 45 ++-------------
 17 files changed, 62 insertions(+), 134 deletions(-)

-- 
2.32.0