Re: [RFC 0/2] New MAP_PMEM_AWARE mmap flag

Hi, Dave,

Dave Chinner <david@xxxxxxxxxxxxx> writes:

>> Another potential issue is that MAP_PMEM_AWARE is not enough on its
>> own.  If the filesystem or inode does not support DAX the application
>> needs to assume page cache semantics.  At a minimum MAP_PMEM_AWARE
>> requests would need to fail if DAX is not available.
>
> They will always still need to call msync()/fsync() to guarantee
> data integrity, because the filesystem metadata that indexes the
> data still needs to be committed before data integrity can be
> guaranteed. i.e. MAP_PMEM_AWARE by itself is not sufficient for data
> integrity, and so the app will have to be written like any other app
> that uses page cache based mmap().
>
> Indeed, the application cannot even assume that a fully allocated
> file does not require msync/fsync because the filesystem may be
> doing things like dedupe, defrag, copy on write, etc behind the back
> of the application and so file metadata changes may still be in
> volatile RAM even though the application has flushed its data.

Once you hand out a persistent memory mapping, you sure as heck can't
switch blocks around behind the back of the application.

But even if we're not dealing with persistent memory, you seem to imply
that applications need to fsync just in case the file system did
something behind its back.  In other words, an application opening a
fully allocated file and using fdatasync will also need to call fsync,
just in case.  Is that really what you're suggesting?

> Applications have no idea what the underlying filesystem and storage
> is doing and so they cannot assume that complete data integrity is
> provided by userspace driven CPU cache flush instructions on their
> file data.

This is surprising to me, and goes completely against the proposed
programming model.  In fact, this is a very basic tenet of the operation
of the nvml libraries on pmem.io.

That aside, let me see if I understand you correctly.

An application creates a file and writes to every single block in the
thing, syncs it, closes it.  It then opens it back up, calls mmap with
this new MAP_DAX flag or on a file system mounted with -o dax, and
proceeds to access the file using loads and stores.  It persists its
data by using non-temporal stores, flushing and fencing cpu
instructions.
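
For concreteness, that sequence looks roughly like the sketch below.
The helper names and the 64-byte cache line size are assumptions made
for illustration.  MAP_DAX is only a proposal, so the sketch uses
plain MAP_SHARED.  It uses ordinary stores followed by clflush and
sfence rather than non-temporal stores, and it falls back to msync()
on non-x86-64 builds.  It deliberately makes no fsync call after the
stores, since whether one is required is exactly the question here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#if defined(__x86_64__)
#include <emmintrin.h>          /* _mm_clflush(), _mm_sfence() */
#define HAVE_CLFLUSH 1
#endif

#define CACHELINE 64            /* assumed cache line size */

/* Step 1: create the file, write every block, sync it, close it. */
static int prealloc_file(const char *path, size_t len)
{
    char block[4096];
    size_t done = 0;
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);

    if (fd < 0)
        return -1;
    memset(block, 0, sizeof(block));
    while (done < len) {
        size_t n = len - done < sizeof(block) ? len - done : sizeof(block);
        ssize_t w = write(fd, block, n);

        if (w < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)w;
    }
    if (fsync(fd) != 0) {       /* data and metadata are on media here */
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Step 3: flush the CPU cache lines covering [addr, addr + len). */
static int persist_range(void *addr, size_t len)
{
#ifdef HAVE_CLFLUSH
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);
    _mm_sfence();               /* order the flushes before later stores */
    return 0;
#else
    /* Portable fallback: msync() needs a page-aligned start address. */
    long pg = sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)pg - 1);

    return msync((void *)start, (uintptr_t)addr + len - start, MS_SYNC);
#endif
}

/* Steps 2 and 3: reopen, mmap, store, then flush from userspace. */
static int store_and_persist(const char *path, size_t len, const char *msg)
{
    size_t n = strlen(msg) + 1;
    char *p;
    int fd, rc;

    assert(n <= len);           /* the message must fit in the mapping */
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;
    memcpy(p, msg, n);          /* plain stores into the mapping */
    rc = persist_range(p, n);
    munmap(p, len);
    return rc;
}
```

On a page cache backed mapping the clflush path by itself does not
reach stable storage; it is only meaningful when the mapping really is
DAX.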

If I understand you correctly, you're saying that that application is
not written correctly, because it needs to call fsync to persist
metadata (that it presumably did not modify).  Is that right?

-Jeff
