On Mon, Dec 19, 2016 at 05:18:40PM -0800, Dan Williams wrote:
> On Mon, Dec 19, 2016 at 5:09 PM, Darrick J. Wong
> <darrick.wong@xxxxxxxxxx> wrote:
> > On Mon, Dec 19, 2016 at 02:11:49PM -0700, Ross Zwisler wrote:
> >> On Fri, Sep 16, 2016 at 03:54:05PM +1000, Nicholas Piggin wrote:
> >> <>
> >> > Definitely the first step would be your simple preallocated per
> >> > inode approach until it is shown to be insufficient.
> >>
> >> Reviving this thread a few months later...
> >>
> >> Dave, we're interested in taking a serious look at what it would take to get
> >> PMEM_IMMUTABLE working. Do you still hold the opinion that this is (or could
> >> become, with some amount of work) a workable solution?
> >>
> >> We're happy to do the grunt work for this feature, but we will probably need
> >> guidance from someone with more XFS experience. With you out on extended leave
> >> the first half of 2017, who would be the best person to ask for this guidance?
> >> Darrick?
> >
> > Yes, probably. :)
> >
> > I think where we left off with this (on the XFS side) is some sort of
> > fallocate mode that would allocate blocks, zero them, and then set the
> > DAX and PMEM_IMMUTABLE on-disk inode flags. After that, you'd mmap the
> > file and thereby gain the ability to control write persistence behavior
> > without having to worry about fs metadata updates. As an added plus, I
> > think zeroing the pmem also clears media errors, or something like that.
> >
> > <shrug> Is that a reasonable starting point? My memory is a little foggy.
> >
> > Hmm, I see Dan just posted something about blockdev fallocate. I'll go
> > read that.
>
> That's for device-dax, which is basically a poor man's PMEM_IMMUTABLE
> via a character device interface. It's useful for cases where you want
> an entire nvdimm namespace/volume in "no fs-metadata to worry about"
> mode. But, for sub-allocations of a namespace and support for
> existing tooling, PMEM_IMMUTABLE is much more usable.

Well sure...
but otoh I was thinking that it'd be pretty neat if we could use the
same code regardless of whether the target file was a dax-device or an
xfs file:

	fd = open("<some path>", O_RDWR);
	fstat(fd, &statbuf);
	fallocate(fd, FALLOC_FL_PMEM_IMMUTABLE, 0, statbuf.st_size);
	p = mmap(NULL, statbuf.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	*(p + 42) = 0xDEADBEEF;
	asm { clflush; } /* or whatever */

...so perhaps it would be a good idea to design the fallocate primitive
around "prepare this fd for mmap-only pmem semantics" and let the
backend do zeroing and inode flag changes as necessary to make it
happen. We'd need to do some bikeshedding about what the other falloc
flags mean when we're dealing with pmem files and devices, but I think
we should try to keep the userland presentation the same unless there's
a really good reason not to.

--D
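
For anyone who wants to poke at the userland shape of this today, here is a
rough sketch of the same open/fstat/fallocate/mmap/store/flush sequence that
compiles against a current kernel. It rests on several assumptions:
FALLOC_FL_PMEM_IMMUTABLE is only a proposal, so mode 0 (plain preallocation)
stands in for it; msync() stands in for the cache flush; the helper names
pmem_store32() and pmem_store32_roundtrip() are made up for illustration; and
it has only been thought through for regular files, not for a dax character
device.

```c
#define _GNU_SOURCE		/* for fallocate() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * ASSUMPTION: FALLOC_FL_PMEM_IMMUTABLE exists only as a proposal in this
 * thread.  Mode 0 (plain preallocation) stands in so the sketch compiles.
 */
#define SKETCH_FALLOC_MODE 0

/*
 * Hypothetical helper: store a 32-bit value at byte offset 'off' of 'path'
 * through a shared mapping.  Returns 0 on success, -1 on failure.
 */
static int pmem_store32(const char *path, off_t off, uint32_t val)
{
	struct stat st;
	uint8_t *p;
	int ret = -1;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || off < 0 ||
	    off + (off_t)sizeof(val) > st.st_size)
		goto out_close;

	/* "Prepare this fd"; tolerate backends that cannot preallocate. */
	if (fallocate(fd, SKETCH_FALLOC_MODE, 0, st.st_size) < 0 &&
	    errno != EOPNOTSUPP && errno != ENODEV)
		goto out_close;

	/* MAP_SHARED so the store reaches the backing file. */
	p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	memcpy(p + off, &val, sizeof(val));

	/* Portable stand-in for the cache flush in the snippet above. */
	if (msync(p, st.st_size, MS_SYNC) == 0)
		ret = 0;
	munmap(p, st.st_size);
out_close:
	close(fd);
	return ret;
}

/* Round trip on a scratch file: store via mmap, read back via pread(). */
static int pmem_store32_roundtrip(void)
{
	char path[] = "/tmp/pmem_sketch_XXXXXX";
	const off_t off = 42 * sizeof(uint32_t);
	uint32_t got = 0;
	int ok = 0;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, 4096) == 0 &&
	    pmem_store32(path, off, 0xDEADBEEF) == 0 &&
	    pread(fd, &got, sizeof(got), off) == (ssize_t)sizeof(got))
		ok = (got == 0xDEADBEEF);
	close(fd);
	unlink(path);
	return ok ? 0 : -1;
}
```

Calling pmem_store32_roundtrip() from a test program should return 0 on a
filesystem that supports shared mappings. Swapping SKETCH_FALLOC_MODE for a
real flag, if one is ever merged, would be the only change the caller sees,
which is the point of keeping the userland presentation the same.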