On Tue, Feb 23, 2016 at 10:07:07AM +0000, Rudoff, Andy wrote:
>
> > [Hi Andy - care to properly line break after ~75 characters, that
> > makes reading the message a lot easier, thanks!]
>
> My bad.
>
> >> The instructions give you very fine-grain flushing control, but the
> >> downside is that the app must track what it changes at that fine
> >> granularity.  Both models work, but there's a trade-off.
> >
> > No, the cache flush model simply does not work without a lot of hard
> > work to enable it first.
>
> It's working well enough to pass tests that simulate crashes and
> various workload tests for the apps involved.  And I agree there
> has been a lot of hard work behind it.  I guess I'm not sure why you're
> saying it is impossible or not working.
>
> Let's take an example: an app uses fallocate() to create a DAX file,
> mmap() to map it, msync() to flush changes.  The app follows POSIX
> meaning it doesn't expect file metadata to be flushed magically, etc.
> The app is tested carefully and it works correctly.  Now the msync()
> call used to flush stores is replaced by flushing instructions.
> What's broken?

You haven't told the filesystem to flush to persistent storage any
dirty metadata required to access the user data. If the zeroing and
unwritten extent conversion that the filesystem runs during write
faults into preallocated blocks isn't persistent, then after a crash
the file will read back as unwritten extents, returning zeros rather
than the data that was written.

msync() calls fsync() on file-backed pages, which makes file metadata
changes persistent. Indeed, if you read the fdatasync man page, you
might have noticed that it explicitly requires the filesystem to
flush the metadata needed to access the data that is being synced.

IOWs, the filesystem knows about this dirty metadata that needs to
be flushed to ensure data integrity, userspace doesn't.

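For reference, the example application from the quoted question can be
sketched roughly as below. This is a minimal sketch under stated
assumptions, not code from this thread: store_and_sync() is a
hypothetical helper name, posix_fallocate() stands in for the
fallocate() call in the example, and the path is assumed to live on a
DAX-mounted filesystem (the code itself runs on any filesystem that
supports shared mappings).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is an assumption): preallocate `len` bytes
 * at `path`, store `msg` at offset 0 through a shared mapping, then
 * sync with msync(). Returns 0 on success, -1 on failure.
 */
int store_and_sync(const char *path, size_t len, const char *msg)
{
    size_t n = strlen(msg);
    assert(len > 0 && n <= len);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Preallocate blocks. On filesystems such as XFS or ext4 these
     * typically start out as unwritten extents. */
    if (posix_fallocate(fd, 0, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* The first store takes a write fault; this is where the
     * filesystem zeroes the block and converts the extent, dirtying
     * metadata that userspace never sees. */
    memcpy(p, msg, n);

    /* msync() goes through the filesystem, so the data and the
     * metadata needed to read it back are both written out. A bare
     * CPU cache flush here would cover only the bytes stored above. */
    int rc = msync(p, len, MS_SYNC);

    munmap(p, len);
    close(fd);
    return rc;
}
```

A call such as store_and_sync("/mnt/pmem/file", 4096, "hello") would
exercise the path described above; the mount point is only an
illustration. The design point is that the msync() line is the only
place the filesystem gets told to make its own state persistent.
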
Not to mention that the filesystem will convert and zero much more
than just a single cacheline (whole pages at minimum, could be 2MB
extents for large pages, etc), so the filesystem may require CPU
cache flushes over a much wider range of cachelines than the
application realises are dirty and require flushing for data
integrity purposes. The filesystem knows about these dirty cache
lines, userspace doesn't.

IOWs, your userspace library may have made sure the data it modifies
is in the physical location via your userspace CPU cache flushes,
but there can be a lot of stuff it doesn't know about internal to
the filesystem that also needs to be flushed to ensure data
integrity is maintained.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx