Re: [RFC 0/2] New MAP_PMEM_AWARE mmap flag

On Mon, Feb 22, 2016 at 10:34:45AM -0500, Jeff Moyer wrote:
> Hi, Dave,
> 
> Dave Chinner <david@xxxxxxxxxxxxx> writes:
> 
> >> Another potential issue is that MAP_PMEM_AWARE is not enough on its
> >> own.  If the filesystem or inode does not support DAX the application
> >> needs to assume page cache semantics.  At a minimum MAP_PMEM_AWARE
> >> requests would need to fail if DAX is not available.
> >
> > They will always still need to call msync()/fsync() to guarantee
> > data integrity, because the filesystem metadata that indexes the
> > data still needs to be committed before data integrity can be
> > guaranteed. i.e. MAP_PMEM_AWARE by itself is not sufficient for data
> > integrity, and so the app will have to be written like any other app
> > that uses page cache based mmap().
> >
> > Indeed, the application cannot even assume that a fully allocated
> > file does not require msync/fsync because the filesystem may be
> > doing things like dedupe, defrag, copy on write, etc behind the back
> > of the application and so file metadata changes may still be in
> > volatile RAM even though the application has flushed its data.
> 
> Once you hand out a persistent memory mapping, you sure as heck can't
> switch blocks around behind the back of the application.

Yes we can. All we need to do is lock out page faults, invalidate
the mappings, and change the underlying blocks.  The app using mmap
will refault on its next access, and get the new block mapped into
its address space.

I'll point to hole punching as an example of how we do these
invalidate/modify operations right now, and we expect them to work
and not result in data corruption. We even have tests (e.g. fsx in
xfstests has all these operations enabled) to make sure it works.

> That aside, let me see if I understand you correctly.
> 
> An application creates a file and writes to every single block in the
> thing, sync's it, closes it.  It then opens it back up, calls mmap with
> this new MAP_DAX flag or on a file system mounted with -o dax, and
> proceeds to access the file using loads and stores.  It persists its
> data by using non-temporal stores, flushing and fencing cpu
> instructions.

The moment the app does a write to the file data, we can no longer
assume the filesystem metadata references to the file data are
durable.

> If I understand you correctly, you're saying that that application is
> not written correctly, because it needs to call fsync to persist
> metadata (that it presumably did not modify).  Is that right?

Yes, though fdatasync() would be sufficient because the app only
modified data.
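
As a rough sketch of that pattern, here is the sequence in Python
for brevity. The helper names make_fully_written_file() and
store_and_persist() are made up for illustration, and this uses an
ordinary MAP_SHARED mapping rather than a DAX one. The point is only
the ordering: stores through the mapping, then msync()/fdatasync()
before treating the data as durable.

```python
import mmap
import os
import tempfile


def make_fully_written_file(size):
    """Create a temp file, write every byte of it, fsync it and close it."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())
    return path


def store_and_persist(path, offset, payload):
    """Overwrite part of an existing file through mmap, then sync it."""
    # Fall back to fsync() on platforms that lack fdatasync().
    fdatasync = getattr(os, "fdatasync", os.fsync)
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.fstat(fd).st_size
        if offset < 0 or offset + len(payload) > size:
            raise ValueError("store must stay inside the existing file")
        with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            # Plain stores into the mapping; no write() syscall involved.
            m[offset:offset + len(payload)] = payload
            # msync() on the mapping.
            m.flush()
        # Only file data was modified, so fdatasync() rather than fsync().
        fdatasync(fd)
    finally:
        os.close(fd)
```

For example, store_and_persist(make_fully_written_file(8192), 4096,
b"hello") overwrites five bytes in the second 4k of the file and
returns only after the sync calls have completed.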

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx
