On Wed, Dec 21, 2016 at 08:53:46AM -0800, Dan Williams wrote:
> On Tue, Dec 20, 2016 at 4:40 PM, Darrick J. Wong
> <darrick.wong@xxxxxxxxxx> wrote:
> > On Mon, Dec 19, 2016 at 05:18:40PM -0800, Dan Williams wrote:
> >> On Mon, Dec 19, 2016 at 5:09 PM, Darrick J. Wong
> >> <darrick.wong@xxxxxxxxxx> wrote:
> >> > On Mon, Dec 19, 2016 at 02:11:49PM -0700, Ross Zwisler wrote:
> >> >> On Fri, Sep 16, 2016 at 03:54:05PM +1000, Nicholas Piggin wrote:
> >> >> <>
> >> >> > Definitely the first step would be your simple preallocated per
> >> >> > inode approach until it is shown to be insufficient.
> >> >>
> >> >> Reviving this thread a few months later...
> >> >>
> >> >> Dave, we're interested in taking a serious look at what it would take to get
> >> >> PMEM_IMMUTABLE working. Do you still hold the opinion that this is (or could
> >> >> become, with some amount of work) a workable solution?
> >> >>
> >> >> We're happy to do the grunt work for this feature, but we will probably need
> >> >> guidance from someone with more XFS experience. With you out on extended leave
> >> >> the first half of 2017, who would be the best person to ask for this guidance?
> >> >> Darrick?
> >> >
> >> > Yes, probably. :)
> >> >
> >> > I think where we left off with this (on the XFS side) is some sort of
> >> > fallocate mode that would allocate blocks, zero them, and then set the
> >> > DAX and PMEM_IMMUTABLE on-disk inode flags. After that, you'd mmap the
> >> > file and thereby gain the ability to control write persistence behavior
> >> > without having to worry about fs metadata updates. As an added plus, I
> >> > think zeroing the pmem also clears media errors, or something like that.
> >> >
> >> > <shrug> Is that a reasonable starting point? My memory is a little foggy.
> >> >
> >> > Hmm, I see Dan just posted something about blockdev fallocate. I'll go
> >> > read that.
> >>
> >> That's for device-dax, which is basically a poor man's PMEM_IMMUTABLE
> >> via a character device interface.
> >> It's useful for cases where you want
> >> an entire nvdimm namespace/volume in "no fs-metadata to worry about"
> >> mode. But, for sub-allocations of a namespace and support for
> >> existing tooling, PMEM_IMMUTABLE is much more usable.
> >
> > Well sure... but otoh I was thinking that it'd be pretty neat if we
> > could use the same code regardless of whether the target file was a
> > dax-device or an xfs file:
> >
> >     fd = open("<some path>", O_RDWR);
> >     fstat(fd, &statbuf);
> >     fallocate(fd, FALLOC_FL_PMEM_IMMUTABLE, 0, statbuf.st_size);
> >     p = mmap(NULL, statbuf.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> >
> >     *(p + 42) = 0xDEADBEEF;
> >     asm { clflush; } /* or whatever */
> >
> > ...so perhaps it would be a good idea to design the fallocate primitive
> > around "prepare this fd for mmap-only pmem semantics" and let the
> > backend do zeroing and inode flag changes as necessary to make it
> > happen. We'd need to do some bikeshedding about what the other falloc
> > flags mean when we're dealing with pmem files and devices, but I think
> > we should try to keep the userland presentation the same unless there's
> > a really good reason not to.
>
> It would be interesting to use fallocate to size device-dax files...

No. device-dax needs to die, not poison a bunch of existing file and
block device APIs and behaviours with special snowflakes. Get
DAX-enabled filesystems to do what you need, and get rid of this
ugly, nasty hack.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx