On Thu, Jan 21, 2016 at 1:58 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, Jan 21, 2016 at 08:37:11AM -0800, Dan Williams wrote:
>> On Sun, Jan 3, 2016 at 9:54 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > From: Dave Chinner <dchinner@xxxxxxxxxx>
>> >
>> > Rather than just being able to turn DAX on and off via a mount
>> > option, some applications may only want to enable DAX for certain
>> > performance critical files in a filesystem.
>> >
>> > This patch introduces a new inode flag to enable DAX in the v3 inode
>> > di_flags2 field. It adds support for setting and clearing flags in
>> > the di_flags2 field via the XFS_IOC_FSSETXATTR ioctl, and sets the
>> > S_DAX inode flag appropriately when it is seen.
>> >
>> > When this flag is set on a directory, it acts as an "inherit flag".
>> > That is, inodes created in the directory will automatically inherit
>> > the on-disk inode DAX flag, enabling administrators to set up
>> > directory heirarchies that automatically use DAX. Setting this flag
>> > on an empty root directory will make the entire filesystem use DAX
>> > by default.
>>
>> When switching from page-cache to DAX, don't we need to flush existing
>> page cache mappings and remap directly? Or, is the thought that
>> userspace needs to comprehend the presence of mixed mappings after
>> changing S_DAX?
>
> The change should be transparent to userspace. In general, I don't
> expect users to change the behaviour of files that are in active use
> (why would you do that?).

If by accident someone tries to dynamically change S_DAX while
existing mappings are established I think the kernel should just
return EBUSY. I was not proposing we support it as a first-class
operation.

> This patch is really just introducing the
> flag, the userspace API and making it propagate correctly via the
> on-disk format. We'll fix up whatever problems with switching it
> on/off dynamically as we go, like we do with most experimental
> features once the on-disk behaviour is sorted out.

Ok.

> i.e. I've already got a couple of fixes we need to add to this - the
> DAX flag is only valid on CRC enabled filesystems,

I assume for torn-write protection? The CRC limitation makes sense,
but we theoretically could get the same effect by using a separate
logdev that does not tear writes, right?

> so we need to
> check that in the ioctl (general problem with using di_flags2 field,
> not DAX flag specific issue). Adding a code to sync and unmap when
> changing the flag is probably also necessary in the ioctl - I don't
> have code to do that yet, but I have been thinking about it...

Matthew and I have also talked about a modification of mincore(2) to
interrogate the effective mapping mode. It seems we'll need that or
something like it given the growing list of caveats with setting up a
DAX mapping.
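
For reference, here is a rough userspace sketch of how an application
might toggle the per-inode flag being discussed. It is not taken from
the patch. It assumes the generic names FS_IOC_FSGETXATTR,
FS_IOC_FSSETXATTR and FS_XFLAG_DAX from <linux/fs.h> rather than the
XFS_IOC_FSSETXATTR spelling quoted above, and the helper names
dax_xflags() and set_inode_dax() are made up for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FSGETXATTR, FS_IOC_FSSETXATTR */

/* Assumption: fall back to the mainline value if the installed
 * headers predate the generic FS_XFLAG_DAX definition. */
#ifndef FS_XFLAG_DAX
#define FS_XFLAG_DAX 0x00008000
#endif

/* Hypothetical helper: return xflags with the DAX bit set or cleared,
 * leaving every other flag untouched. */
static uint32_t dax_xflags(uint32_t xflags, int enable)
{
    if (enable)
        return xflags | FS_XFLAG_DAX;
    return xflags & ~(uint32_t)FS_XFLAG_DAX;
}

/* Hypothetical helper: read-modify-write the inode's extended flags.
 * Returns 0 on success, -1 with errno set by ioctl(2) on failure. */
static int set_inode_dax(int fd, int enable)
{
    struct fsxattr fsx;

    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
        return -1;

    fsx.fsx_xflags = dax_xflags(fsx.fsx_xflags, enable);

    if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0)
        return -1;

    return 0;
}
```

A caller would open the file or directory and call set_inode_dax(fd, 1).
If the kernel ended up rejecting the change while mappings exist, as
suggested above, the caller would presumably see -1 with errno set to
EBUSY, but that behaviour is only a proposal at this point.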