On 6/25/22 5:11 AM, Christoph Hellwig wrote:
> On Fri, Jun 24, 2022 at 03:07:50PM +0200, Jan Kara wrote:
>> I'm not sure I get the context 100% right but pages getting randomly dirty
>> behind filesystem's back can still happen - most commonly with RDMA and
>> similar stuff which calls set_page_dirty() on pages it has got from
>> pin_user_pages() once the transfer is done. page_maybe_dma_pinned() should
>> be usable within filesystems to detect such cases and protect the
>> filesystem but so far neither me nor John Hubbard has got to implement this
>> in the generic writeback infrastructure + some filesystem as a sample case
>> others could copy...
>
> Well, so far the strategy elsewhere seems to be to just ignore pages
> only dirtied through get_user_pages. E.g. iomap skips over pages
> reported as holes, and ext4_writepage complains about pages without
> buffers and then clears the dirty bit and continues.
>
> I'm kinda surprised that btrfs wants to treat this so special
> especially as more of the btrfs page and sub-page status will be out
> of date as well.

As Sterba points out later in the thread, btrfs cares more because of
stable page requirements to protect data during COW and to make sure the
crcs we write to disk are correct.

The fixup worker path is pretty easy to trigger if you do O_DIRECT reads
into mmap'd pages. You need some memory pressure to power through
get_user_pages trying to do the right thing, but it does happen.

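For reference, a minimal userspace sketch of that I/O pattern. Nothing in
it comes from this thread beyond the idea itself: the helper name
dio_read_into_mmap(), the paths, and the 4096-byte length alignment are
all assumptions made for illustration. On its own it will usually not
reach the fixup worker, since the race still needs memory pressure; it
only shows where the pinned pages come from.

```c
#define _GNU_SOURCE            /* for O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any existing test suite): read `len`
 * bytes from `src_path` with O_DIRECT straight into a MAP_SHARED
 * mapping of `map_path`, then msync() the mapping so writeback has to
 * visit the pages. Returns the number of bytes read, or -1 with errno
 * set.
 *
 * Assumption: `len` is a multiple of 4096 so the O_DIRECT length is
 * aligned; mmap() already hands back a page-aligned buffer.
 */
ssize_t dio_read_into_mmap(const char *src_path, const char *map_path,
                           size_t len)
{
    assert(len > 0 && len % 4096 == 0);

    int map_fd = open(map_path, O_RDWR | O_CREAT, 0644);
    if (map_fd < 0)
        return -1;

    /* Size the file so every page of the mapping is backed by it. */
    if (ftruncate(map_fd, (off_t)len) < 0) {
        int saved = errno;
        close(map_fd);
        errno = saved;
        return -1;
    }

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                     map_fd, 0);
    if (buf == MAP_FAILED) {
        int saved = errno;
        close(map_fd);
        errno = saved;
        return -1;
    }

    ssize_t n = -1;
    int saved = 0;
    int src_fd = open(src_path, O_RDONLY | O_DIRECT);
    if (src_fd >= 0) {
        /*
         * The kernel pins the mapped pages for the duration of the
         * direct read and marks them dirty itself when the I/O is done.
         */
        n = pread(src_fd, buf, len, 0);
        saved = errno;
        close(src_fd);
    } else {
        saved = errno;          /* e.g. EINVAL if O_DIRECT is unsupported */
    }

    /* Push the mapping through writeback if the read succeeded. */
    if (n > 0 && msync(buf, len, MS_SYNC) < 0) {
        saved = errno;
        n = -1;
    }

    munmap(buf, len);
    close(map_fd);
    if (n < 0)
        errno = saved;
    return n;
}
```

Something like dio_read_into_mmap("/mnt/btrfs/src", "/mnt/btrfs/dst", 1 << 20)
would exercise it, with both (made-up) paths on the btrfs mount under test.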
I'd love a proper fix for this on the *_user_pages() side where
page_mkwrite() style notifications are used all the time. It's just a
huge change, and my answer so far has always been that using btrfs
mmap'd memory for this kind of thing isn't a great choice either way.
-chris