[LSF/MM/BPF TOPIC] FOLL_PIN + file systems

Hi,

By the time we meet for LSF/MM/BPF in May, the Direct IO layer will
likely have been converted to use FOLL_PIN page pinning (that is,
changed from calling get_user_pages_fast() to calling
pin_user_pages_fast()).

Direct IO conversion to FOLL_PIN was the last missing piece, and so the
time is right to discuss how to use the output of all of this work
(which is: the ability to call page_maybe_dma_pinned()), in order to fix
one of the original problems that prompted FOLL_PIN's creation.
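
For anyone who has not looked at page_maybe_dma_pinned() recently, here
is a rough user-space model of the idea for small pages, as I understand
it: a FOLL_PIN reference adds a large bias to the page refcount, and the
check just asks whether the refcount has reached that bias. The bias
value matches the kernel's GUP_PIN_COUNTING_BIAS, but struct model_page
and all of the model_*() helpers are hypothetical names made up for this
sketch. It is not kernel code, and it ignores large folios, which track
pins separately.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel's GUP_PIN_COUNTING_BIAS (1U << 10). */
#define PIN_BIAS 1024

/* Hypothetical stand-in for struct page: just a reference count. */
struct model_page {
	int refcount;
};

/* Models an ordinary reference, as taken by get_user_pages*(). */
static void model_get(struct model_page *p)
{
	p->refcount += 1;
}

/* Drops an ordinary reference. */
static void model_put(struct model_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/* Models a FOLL_PIN reference, as taken by pin_user_pages*(). */
static void model_pin(struct model_page *p)
{
	p->refcount += PIN_BIAS;
}

/* Drops a FOLL_PIN reference, as unpin_user_page() would. */
static void model_unpin(struct model_page *p)
{
	assert(p->refcount >= PIN_BIAS);
	p->refcount -= PIN_BIAS;
}

/*
 * Models page_maybe_dma_pinned(). The "maybe" is real: PIN_BIAS
 * ordinary references are indistinguishable from one pin, so false
 * positives are possible, but a pinned page is never missed.
 */
static bool model_maybe_dma_pinned(const struct model_page *p)
{
	return p->refcount >= PIN_BIAS;
}
```

In this model, a page with a handful of ordinary references reports
"not pinned", a single model_pin() flips it to "maybe pinned", and so do
1024 calls to model_get(). That one-sided error is what a file system
would have to be prepared for if it keys decisions off of this check.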

That problem is: file systems do not currently tolerate having their
pages pinned and DMA'd to/from. See [1] for extensive background: some
11 LWN articles since 2018.

In order to fix that problem, now that FOLL_PIN is available and wired
up, we should revisit some of the leading proposals. For example, using
file leases to mark out areas that are safe for FOLL_PIN page pins is
probably about right, with enough caveats to avoid breaking things for
existing users. This is worth discussing at the conference. This topic
has been contentious in past sessions, but it should be a little easier
to make headway now, because the recent progress has left fewer
"variables".

I'll volunteer to present a few slides to provide the background and get
the discussion started. It's critical to have filesystem people in
attendance for this, such as Jan Kara, Dave Chinner, Christoph Hellwig,
and many more that I won't try to list explicitly here. RDMA
representation (Jason Gunthorpe, Leon Romanovsky, Chaitanya Kulkarni,
and others) will help keep the file system folks from creating rules
that break RDMA users "too much". And of course -mm folks. There are many
people who have contributed to this project, so again, apologies for not
listing everyone explicitly.

Worth mentioning: in addition to the Direct IO work, FOLL_PIN has been
used to help improve the accuracy of copy-on-write (COW) behavior [2].
And that eventually led to some speculation on Linus' part [3], that
"legacy gup" (wow, someone already said those words!) might eventually
be rare, and that most callers of the mm/gup.c functions would actually
want access to the pages' contents for DMA or Direct IO. Those cases
require calling the FOLL_PIN (pin_user_pages*) variants.


[1] https://lwn.net/Kernel/Index/#Memory_management-get_user_pages

[2] https://lwn.net/Articles/849876/ Patching until the COWs come home
    (part 2)

[3] https://lore.kernel.org/r/CAHk-=whUEZC2skXPUWy93DpNmC0VF=t2EwmEgWGx8aPstTmWYA@xxxxxxxxxxxxxx


thanks,
--
John Hubbard
NVIDIA


