On Wed, Aug 27, 2014 at 01:06:13PM -0700, Andrew Morton wrote: > On Tue, 26 Aug 2014 23:45:20 -0400 Matthew Wilcox <matthew.r.wilcox@xxxxxxxxx> wrote: > > > One of the primary uses for NV-DIMMs is to expose them as a block device > > and use a filesystem to store files on the NV-DIMM. While that works, > > it currently wastes memory and CPU time buffering the files in the page > > cache. We have support in ext2 for bypassing the page cache, but it > > has some races which are unfixable in the current design. This series > > of patches rewrite the underlying support, and add support for direct > > access to ext4. > > Sat down to read all this but I'm finding it rather unwieldy - it's > just a great blob of code. Is there some overall > what-it-does-and-how-it-does-it roadmap? The overall goal is to map persistent memory / NV-DIMMs directly to userspace. We have that functionality in the XIP code, but the way it's structured is unsuitable for filesystems like ext4 & XFS, and it has some pretty ugly races. Patches 1 & 3 are simply bug-fixes. They should go in regardless of the merits of anything else in this series. Patch 2 changes the API for the direct_access block_device_operation so it can report more than a single page at a time. As the series evolved, this work also included moving support for partitioning into the VFS where it belongs, handling various error cases in the VFS and so on. Patch 4 is an optimisation. It's poor form to make userspace take two faults for the same dereference. Patch 5 gives us a VFS flag for the DAX property, which lets us get rid of the get_xip_mem() method later on. Patch 6 is also prep work; Al Viro liked it enough that it's now in his tree. The new DAX code is then dribbled in over patches 7-11, split up by functional area. At each stage, the ext2-xip code is converted over to the new DAX code. Patches 12-18 delete the remnants of the old XIP code, and fix the things in ext2 that Jan didn't like when he reviewed them for ext4 :-) Patches 19 & 20 are the work to make ext4 use DAX. Patch 21 is some final cleanup of references to the old XIP code, renaming it all to DAX. > Some explanation of why one would use ext4 instead of, say, > suitably-modified ramfs/tmpfs/rd/etc? ramfs and tmpfs really rely on the page cache. They're not exactly built for permanence either. brd also relies on the page cache, and there's a clear desire to use a filesystem instead of a block device for all the usual reasons of access permissions, grow/shrink, etc. Some people might want to use XFS instead of ext4. We're starting with ext4, but we've been keeping an eye on what other filesystems might want to use. btrfs isn't going to use the DAX code, but some of the other pieces will probably come in handy. There are also at least three people working on their own filesystems specially designed for persistent memory. I wish them all the best ... but I'd like to get this infrastructure into place. > Performance testing results? I haven't been running any performance tests. What sort of performance tests would be interesting for you to see? > Carsten Otte wrote filemap_xip.c and may be a useful reviewer of this > work. I cc'd him on some earlier versions and didn't hear anything back. It felt rude to keep plying him with 20+ patches every month. > All the patch subjects violate Documentation/SubmittingPatches > section 15 ;) errr ... which bit? I used git format-patch to create them. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html