[PATCH 00/13] Fix the DAX-gup mistake

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



tl;dr: Move the pin of 'struct dev_pagemap' instances from gup-time to
map time, move the unpin of 'struct dev_pagemap' to truncate_inode_pages()
for fsdax and devdax inodes, and use page_maybe_dma_pinned() to
determine when filesystems can safely truncate DAX mappings vs DMA.

The longer story is that DAX has caused friction with folio development
and other device-memory use cases due to its hack of using a
page-reference count of 1 to indicate that the page is DMA idle. That
situation arose from the mistake of not managing DAX page reference
counts at map time. The lack of page reference counting at map time grew
organically from the original DAX experiment of attempting to manage DAX
mappings without page structures. The page lock, dirty tracking and
other entry management was supported sans pages. However, the page
support was then bolted on incrementally so solve problems with gup,
memory-failure, and all the other kernel services that are missing when
a pfn does not have an associated page structure.

Since then John has led an effort to account for when a page is pinned
for DMA vs other sources that elevate the reference count. The
page_maybe_dma_pinned() helper slots in seamlessly to replace the need
to track transitions to page->_refount == 1.

The larger change in this set comes from Jason's observation that
inserting DAX mappings without any reference taken is a bug. So
dax_insert_entry(), that fsdax uses, is updated to take 'struct
dev_pagemap' references, and devdax is updated to reuse the same.

This allows for 'struct dev_pagemap' manipulation to be self-contained
to DAX-specific paths. It is also a foundation to build towards removing
pte_devmap() and start treating DAX pages as another vm_normal_page(),
and perhaps more conversions of the DAX infrastructure to reuse typical
page mapping helpers. One of the immediate hurdles is the usage of
pmd_devmap() to distinguish large page mappings that are not transparent
huge pages.

---

Dan Williams (13):
      fsdax: Rename "busy page" to "pinned page"
      fsdax: Use page_maybe_dma_pinned() for DAX vs DMA collisions
      fsdax: Delete put_devmap_managed_page_refs()
      fsdax: Update dax_insert_entry() calling convention to return an error
      fsdax: Cleanup dax_associate_entry()
      fsdax: Rework dax_insert_entry() calling convention
      fsdax: Manage pgmap references at entry insertion and deletion
      devdax: Minor warning fixups
      devdax: Move address_space helpers to the DAX core
      dax: Prep dax_{associate,disassociate}_entry() for compound pages
      devdax: add PUD support to the DAX mapping infrastructure
      devdax: Use dax_insert_entry() + dax_delete_mapping_entry()
      mm/gup: Drop DAX pgmap accounting


 .clang-format             |    1
 drivers/Makefile          |    2
 drivers/dax/Kconfig       |    6
 drivers/dax/Makefile      |    1
 drivers/dax/bus.c         |    2
 drivers/dax/dax-private.h |    1
 drivers/dax/device.c      |   73 ++-
 drivers/dax/mapping.c     | 1020 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/dax/super.c       |    2
 fs/dax.c                  | 1049 ++-------------------------------------------
 fs/ext4/inode.c           |    9
 fs/fuse/dax.c             |   10
 fs/xfs/xfs_file.c         |    8
 fs/xfs/xfs_inode.c        |    2
 include/linux/dax.h       |  124 ++++-
 include/linux/huge_mm.h   |   23 -
 include/linux/memremap.h  |   24 +
 include/linux/mm.h        |   58 +-
 mm/gup.c                  |   92 +---
 mm/huge_memory.c          |   54 --
 mm/memremap.c             |   31 -
 mm/swap.c                 |    2
 22 files changed, 1326 insertions(+), 1268 deletions(-)
 create mode 100644 drivers/dax/mapping.c

base-commit: 1c23f9e627a7b412978b4e852793c5e3c3efc555




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux