Re: [PATCH v2 00/15] btrfs dax support

Goldwyn Rodrigues <rgoldwyn@xxxxxxx> · Wed, 27 Mar 2019 18:26:35 -0500

On 21:14 27/03, Adam Borowski wrote:
> On Tue, Mar 26, 2019 at 02:09:08PM -0500, Goldwyn Rodrigues wrote:
> > This patch set adds support for dax on the BTRFS filesystem.
> 
> This patchset doesn't seem to support MAP_SYNC, which is the usual way to
> use (and detect) DAX.  Basically, it requests for page faults to be
> synchronous -- ie, when a page fault returns, the mapping points to actual
> memory rather than to some buffer that'll be written back to the destination
> at some point in the future.

The translation (through the different flags/return values) goes as follows:
MAP_SYNC -> VM_SYNC -> VM_FAULT_NEEDDSYNC.
So, when dax_iomap_fault() returns VM_FAULT_NEEDDSYNC, the fault is
completed through dax_finish_sync_fault(). This is how all filesystems
are doing it currently. Refer to patch 09/15.
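
For testing from userspace, a minimal sketch of the usual MAP_SYNC probe
could look like the following. The helper name map_sync_supported() is
made up for illustration and is not part of this series; the fallback
values for MAP_SHARED_VALIDATE and MAP_SYNC are the generic uapi ones and
are only used if the libc headers do not provide them (some architectures
define different values).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers (generic uapi values, an assumption). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/*
 * Hypothetical helper, for illustration only.
 * Returns  1 if a MAP_SYNC mapping of 'path' succeeds,
 *          0 if the kernel/filesystem refuses it with EOPNOTSUPP,
 *         -1 on any other error (errno is preserved).
 */
int map_sync_supported(const char *path)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	void *p;
	int saved;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* MAP_SYNC is only honoured together with MAP_SHARED_VALIDATE. */
	p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
		 MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	saved = errno;
	close(fd);

	if (p != MAP_FAILED) {
		munmap(p, (size_t)pagesz);
		return 1;
	}

	errno = saved;
	return saved == EOPNOTSUPP ? 0 : -1;
}
```

Calling it on a file that is not on a DAX mount should return 0 on
kernels that know MAP_SHARED_VALIDATE, and 1 on a file on a DAX mount
whose filesystem implements the synchronous fault path.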

> 
> Also, not really understanding these parts of the kernel, I can't tell if
> the snapshots are atomic.  Ie, while the kernel walks over pages to set
> mprotect flags, the process does two writes:
>    RRRRRRRRRRRRRRRRRRRWWWWWWWWWWWWWWWWWWWWWW (R=ro W=rw)
>         A                       B
> The write at A causes a page fault, which clones the page, CoWing it and
> letting the write into only one of the replicas.  After this, write to B
> happens before the mprotect, thus goes into both replicas -- and despite
> the process having issued proper memory barriers, the other replica has
> B but not A.  To fix this, earlier page faults can't get finalized until
> all mprotects are in place.  (I'm writing this as a query rather than a
> problem report -- I'm an ignoramus here.)

When you initiate a snapshot, btrfs forces everything to CoW until the
snapshot finishes. This guarantees that all new allocations are CoW,
even if the extent is set to nocow. During this time, all "writebacks"
happen. We don't have writebacks in DAX, but we take this opportunity
to write-protect the mmap'd pages.
For more details, refer to patch 15/15 in the series.

-- 
Goldwyn