On Thu, Feb 11, 2016 at 02:59:14PM -0800, Dan Williams wrote:
> On Thu, Feb 11, 2016 at 2:46 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Thu, Feb 11, 2016 at 12:58:38PM -0800, Dan Williams wrote:
> >> On Thu, Feb 11, 2016 at 12:46 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >> Maybe I don't need to worry because it's already the case that a
> >> mmap of the raw device may not see the most up to date data for a
> >> file that has dirty fs-page-cache data.
> >
> > It goes both ways. What happens if mkfs or fsck modifies the
> > block device via mmap+DAX and then the filesystem mounts the block
> > device and tries to read that metadata via the block device page
> > cache?
> >
> > Quite frankly, DAX on the block device is a can of worms we really
> > don't need to deal with right now. IMO it's a solution looking for a
> > problem to solve,
>
> Virtualization use cases want to give large ranges to guest-VMs, and
> it is currently the only way to reliably get 1GiB mappings.

Precisely my point - block devices are not the best way to solve
this problem. A file, on XFS, with a 1GB extent size hint and
preallocated to be aligned to 1GB addresses (i.e. mkfs.xfs -d
su=1G,sw=1 on the host filesystem) will give reliable 1GB aligned
blocks for DAX mappings, just like a block device will.

Performance wise it's little different to using the block device
directly. Management wise it's way more flexible, especially as
such image files can be recycled for new VMs almost instantly via
FALLOC_FL_ZERO_RANGE.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
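
A rough sketch of the image-file workflow described above, in Python, for illustration only; it is not code from this thread. It assumes 64-bit Linux (off_t is 64 bits wide) and calls fallocate(2) through ctypes, because the os module has no wrapper that takes a mode argument. The helper names round_up, create_image and zero_image are hypothetical. The sketch does not set the extent size hint and cannot guarantee physical 1GB alignment; that still depends on how the host filesystem was made and configured.

```python
import ctypes
import errno
import os

GIB = 1 << 30                 # 1 GiB, the mapping size discussed above
FALLOC_FL_ZERO_RANGE = 0x10   # value from <linux/falloc.h>


def round_up(size, align=GIB):
    """Round size up to the next multiple of align."""
    return -(-size // align) * align


def create_image(path, size, align=GIB):
    """Create an image file preallocated to a multiple of align bytes."""
    length = round_up(size, align)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, length)   # reserve the blocks up front
    finally:
        os.close(fd)
    return length


def _fallocate(fd, mode, offset, length):
    """Thin ctypes wrapper around fallocate(2); assumes a 64-bit off_t."""
    libc = ctypes.CDLL(None, use_errno=True)
    func = libc.fallocate                   # AttributeError if libc lacks it
    func.argtypes = (ctypes.c_int, ctypes.c_int,
                     ctypes.c_int64, ctypes.c_int64)
    if func(fd, mode, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def _write_zeros(fd, length, chunk=1 << 20):
    """Slow path: overwrite the first length bytes with zeros."""
    zeros = bytes(chunk)
    offset = 0
    while offset < length:
        n = min(chunk, length - offset)
        offset += os.pwrite(fd, zeros[:n], offset)


def zero_image(path):
    """Zero an existing image in place so it can be reused for a new VM."""
    fd = os.open(path, os.O_RDWR)
    try:
        length = os.fstat(fd).st_size
        if length == 0:
            return 0
        try:
            _fallocate(fd, FALLOC_FL_ZERO_RANGE, 0, length)
        except AttributeError:
            _write_zeros(fd, length)        # no fallocate symbol in libc
        except OSError as exc:
            if exc.errno not in (errno.EOPNOTSUPP, errno.ENOSYS):
                raise
            _write_zeros(fd, length)        # filesystem lacks ZERO_RANGE
        return length
    finally:
        os.close(fd)
```

Usage would look something like create_image("/mnt/xfs/guest0.img", 16 * GIB) when provisioning, then zero_image("/mnt/xfs/guest0.img") before handing the file to the next guest; the path is made up. Where the filesystem supports FALLOC_FL_ZERO_RANGE the zeroing is a metadata operation rather than a rewrite of every block, which is what makes the recycling nearly instant. The fallback that writes zeros is there only so the sketch still works elsewhere.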