On Thu, May 04, 2023 at 10:27:50PM +0100, Lorenzo Stoakes wrote:
> Writing to file-backed mappings which require folio dirty tracking using
> GUP is a fundamentally broken operation, as kernel write access to GUP
> mappings do not adhere to the semantics expected by a file system.
>
> A GUP caller uses the direct mapping to access the folio, which does not
> cause write notify to trigger, nor does it enforce that the caller marks
> the folio dirty.
>
> The problem arises when, after an initial write to the folio, writeback
> results in the folio being cleaned and then the caller, via the GUP
> interface, writes to the folio again.
>
> As a result of the use of this secondary, direct, mapping to the folio no
> write notify will occur, and if the caller does mark the folio dirty, this
> will be done so unexpectedly.
>
> For example, consider the following scenario:-
>
> 1. A folio is written to via GUP which write-faults the memory, notifying
>    the file system and dirtying the folio.
> 2. Later, writeback is triggered, resulting in the folio being cleaned and
>    the PTE being marked read-only.
> 3. The GUP caller writes to the folio, as it is mapped read/write via the
>    direct mapping.
> 4. The GUP caller, now done with the page, unpins it and sets it dirty
>    (though it does not have to).
>
> This change updates both the PUP FOLL_LONGTERM slow and fast APIs. As
> pin_user_pages_fast_only() does not exist, we can rely on a slightly
> imperfect whitelisting in the PUP-fast case and fall back to the slow case
> should this fail.

[snip]

As discussed at LSF/MM, on the flight over I wrote a little repro [0] which
reliably triggers the ext4 warning by recreating the scenario described
above, using a small userland program and kernel module.

This code is not perfect (plane code :) but does seem to do the job
adequately, also obviously this should only be run in a VM environment
where data loss is acceptable (in my case a small qemu instance).

Hopefully this is useful in some way. Note that I explicitly use
pin_user_pages() without FOLL_LONGTERM here in order to not run into the
mitigation this very patch series provides! Obviously if you revert this
series you can see the same happening with FOLL_LONGTERM set.

I have licensed the code as GPLv2 so anybody's free to do with it as they
will if it's useful in any way!

[0]: https://github.com/lorenzo-stoakes/gup-repro
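
For illustration only, the four-step scenario quoted above can be modelled
as a toy userland state machine. This is a sketch under stated assumptions:
it is not the repro at [0] and not kernel code. Every name in it
(struct toy_folio, toy_pin_for_write() and so on) is hypothetical, and the
flags merely stand in for the folio dirty bit, the PTE write permission and
whether the file system has seen a write notify since the last writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
struct toy_folio {
	bool dirty;            /* folio dirty flag as the file system sees it */
	bool pte_writable;     /* userspace PTE currently permits writes */
	bool fs_expects_write; /* write notify seen since the last writeback */
	bool pinned;           /* a GUP caller holds a pin */
	char data;             /* stand-in for the folio contents */
};

/* Step 1: the GUP write fault notifies the FS and dirties the folio. */
void toy_pin_for_write(struct toy_folio *f)
{
	assert(f != NULL);
	f->fs_expects_write = true;
	f->pte_writable = true;
	f->dirty = true;
	f->pinned = true;
}

/* Step 2: writeback cleans the folio and write-protects the PTE.
 * The pin itself is left untouched. */
void toy_writeback(struct toy_folio *f)
{
	assert(f != NULL);
	f->dirty = false;
	f->pte_writable = false;
	f->fs_expects_write = false;
}

/* Step 3: a write via the direct mapping. It never consults
 * pte_writable and never triggers a write notify. */
void toy_direct_map_write(struct toy_folio *f, char value)
{
	assert(f != NULL && f->pinned);
	f->data = value;
}

/* Step 4: unpin, optionally marking the folio dirty. Returns true if
 * the folio was dirtied without the FS having been notified. */
bool toy_unpin(struct toy_folio *f, bool make_dirty)
{
	bool unexpected = false;

	assert(f != NULL);
	f->pinned = false;
	if (make_dirty) {
		unexpected = !f->fs_expects_write;
		f->dirty = true;
	}
	return unexpected;
}

/* Runs steps 1-4 in order and reports whether the model ends up in
 * the "dirtied unexpectedly" state. */
bool toy_run_scenario(void)
{
	struct toy_folio f = {0};

	toy_pin_for_write(&f);
	toy_writeback(&f);
	toy_direct_map_write(&f, 'x');
	return toy_unpin(&f, true);
}
```

In this model toy_run_scenario() returns true, because step 4 dirties the
folio after step 2 has cleared the notification. Skipping the writeback
step makes toy_unpin() return false, which corresponds to the case the
file system expects.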