Hi, Dave,

Dave Chinner <david@xxxxxxxxxxxxx> writes:

>> Another potential issue is that MAP_PMEM_AWARE is not enough on its
>> own.  If the filesystem or inode does not support DAX the application
>> needs to assume page cache semantics.  At a minimum MAP_PMEM_AWARE
>> requests would need to fail if DAX is not available.
>
> They will always still need to call msync()/fsync() to guarantee
> data integrity, because the filesystem metadata that indexes the
> data still needs to be committed before data integrity can be
> guaranteed. i.e. MAP_PMEM_AWARE by itself is not sufficient for data
> integrity, and so the app will have to be written like any other app
> that uses page cache based mmap().
>
> Indeed, the application cannot even assume that a fully allocated
> file does not require msync/fsync because the filesystem may be
> doing things like dedupe, defrag, copy on write, etc behind the back
> of the application and so file metadata changes may still be in
> volatile RAM even though the application has flushed its data.

Once you hand out a persistent memory mapping, you sure as heck can't
switch blocks around behind the back of the application.

But even if we're not dealing with persistent memory, you seem to imply
that applications need to fsync just in case the file system did
something behind their back.  In other words, an application opening a
fully allocated file and using fdatasync will also need to call fsync,
just in case.  Is that really what you're suggesting?

> Applications have no idea what the underlying filesystem and storage
> is doing and so they cannot assume that complete data integrity is
> provided by userspace driven CPU cache flush instructions on their
> file data.

This is surprising to me, and goes completely against the proposed
programming model.  In fact, this is a very basic tenet of the operation
of the nvml libraries on pmem.io.

That aside, let me see if I understand you correctly.

An application creates a file and writes to every single block in the
thing, sync's it, closes it.  It then opens it back up, calls mmap with
this new MAP_DAX flag or on a file system mounted with -o dax, and
proceeds to access the file using loads and stores.  It persists its
data by using non-temporal stores, flushing and fencing cpu
instructions.

If I understand you correctly, you're saying that that application is
not written correctly, because it needs to call fsync to persist
metadata (that it presumably did not modify).  Is that right?

-Jeff
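
For concreteness, the scenario described above might look roughly like
the following.  This is only a sketch under stated assumptions, not the
nvml implementation: the function names, FILE_SIZE and the 64-byte
CACHELINE are made up for illustration, plain MAP_SHARED stands in for
the proposed MAP_DAX / MAP_PMEM_AWARE flags (which are not part of any
released mmap interface), and clflush plus sfence stands in for
whatever store/flush/fence sequence a pmem library would actually use.
The call_msync argument marks the step under dispute.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
#if defined(__x86_64__)
#include <emmintrin.h>          /* _mm_clflush, _mm_sfence */
#endif

#define FILE_SIZE (1u << 20)    /* hypothetical size: 1 MiB */
#define CACHELINE 64            /* assumed cache line size */

/* Flush [addr, addr + len) out of the CPU caches, then fence. */
static void cpu_persist(const void *addr, size_t len)
{
#if defined(__x86_64__)
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    for (; p < (uintptr_t)addr + len; p += CACHELINE)
        _mm_clflush((const void *)p);
    _mm_sfence();
#else
    (void)addr; (void)len;
    __sync_synchronize();       /* placeholder only, no durability implied */
#endif
}

/* Step 1: create the file, write every block, sync it, close it. */
int prealloc_file(const char *path)
{
    char block[4096];
    memset(block, 0, sizeof block);
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    for (size_t off = 0; off < FILE_SIZE; off += sizeof block) {
        if (write(fd, block, sizeof block) != (ssize_t)sizeof block) {
            close(fd);
            return -1;
        }
    }
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Step 2: reopen, map, store, then flush from userspace. */
int store_and_persist(const char *path, const char *msg, int call_msync)
{
    size_t len = strlen(msg) + 1;
    assert(len <= FILE_SIZE);
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    /* MAP_SHARED is a stand-in for the proposed DAX-aware mapping. */
    char *p = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memcpy(p, msg, len);        /* ordinary stores into the mapping */
    cpu_persist(p, len);        /* userspace cache flush + fence */
    int rc = 0;
    if (call_msync)             /* the step being argued about */
        rc = msync(p, FILE_SIZE, MS_SYNC);
    munmap(p, FILE_SIZE);
    close(fd);
    return rc;
}
```

On a page cache backed mapping, which is what this sketch actually
gets on an ordinary file system, the msync is required regardless.  The
question above is whether it is still required once the mapping is DAX
and no metadata was modified by the application.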