Re: [Lsf-pc] [LSF/MM TOPIC] [ATTEND] Persistent memory

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 22 Jan 2014 07:20:43 +1100

On Tue, Jan 21, 2014 at 05:57:14AM -0800, Howard Chu wrote:
> Dave Chinner wrote:
> >On Mon, Jan 20, 2014 at 11:38:16PM -0800, Howard Chu wrote:
> >>Andy Lutomirski wrote:
> >>>On 01/16/2014 08:17 PM, Howard Chu wrote:
> >>>>Andy Lutomirski wrote:
> >>>>>I'm interested in a persistent memory track.  There seems to be plenty
> >>>>>of other emails about this, but here's my take:
> >>>>
> >>>>I'm also interested in this track. I'm not up on FS development these
> >>>>days, the last time I wrote filesystem code was nearly 20 years ago. But
> >>>>persistent memory is a topic near and dear to my heart, and of great
> >>>>relevance to my current pet project, the LMDB memory-mapped database.
> >>>>
> >>>>In a previous era I also developed block device drivers for
> >>>>battery-backed external DRAM disks. (My ideal would have been systems
> >>>>where all of RAM was persistent. I suppose we can just about get there
> >>>>with mobile phones and tablets these days.)
> >>>>
> >>>>In the context of database engines, I'm interested in leveraging
> >>>>persistent memory for write-back caching and how user level code can be
> >>>>made aware of it. (If all your cache is persistent and guaranteed to
> >>>>eventually reach stable store then you never need to fsync() a
> >>>>transaction.)
> >
> >I don't think that is true -  you're still going to need fsync to get
> >the CPU to flush its caches and filesystem metadata into the
> >persistent domain....
> >
> >>>Hmm.  Presumably that would work by actually allocating cache pages in
> >>>persistent memory.  I don't think that anything like the current XIP
> >>>interfaces can do that, but it's certainly an interesting thought for
> >>>(complicated) future work.
> >>>
> >>>This might not be pretty in conjunction with something like my
> >>>writethrough mapping idea -- read(2) and write(2) would be fine (well,
> >>>write(2) might need to use streaming loads), but mmap users who weren't
> >>>expecting it might have truly awful performance.  That especially
> >>>includes things like databases that aren't expecting this behavior.
> >>
> >>At the moment all I can suggest is a new mmap() flag, e.g.
> >>MAP_PERSISTENT. Not sure how a user or app should discover that it's
> >>supported though.
> >
> >The point of using the XIP interface with filesystems that are
> >backed by persistent memory is that mmap() gives userspace
> >applications direct access to the persistent memory without
> >needing any modifications.  It's just a really, really fast file...
> 
> OK, I see that now. But that only works well when your persistent
> memory size is >= the size of the file(s) you want to work with.

It assumes that you have a persistent memory block device. Given
one, if you want persistent caching on top of the filesystem, use
dm-cache or bcache to stack the persistent memory on top of the
slow block device. i.e. we already have solutions to this problem.
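
To make the "no modifications" point concrete, here is a minimal
sketch of an application writing through a shared file mapping. It
uses only standard POSIX calls; the helper names store_record() and
load_record() and the file path are hypothetical, made up for this
illustration. The assumption is that the same code runs unchanged
whether the file sits on a filesystem backed by persistent memory or
on a slow device behind a cache; only the speed differs. The msync()
call is the explicit flush step, which the application still issues.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy msg (including its NUL) to the start of the
 * file at path through a MAP_SHARED mapping, then flush it.
 * Note: the file is resized to map_len bytes.
 * Returns 0 on success, -1 on error.
 */
int store_record(const char *path, const char *msg, size_t map_len)
{
    assert(path != NULL && msg != NULL);

    size_t n = strlen(msg) + 1;
    if (n > map_len)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)map_len) != 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Plain stores into the mapping; no pmem-specific API is involved. */
    memcpy(p, msg, n);

    /* Explicit flush of the mapped range to the backing store. */
    int rc = msync(p, map_len, MS_SYNC);

    if (munmap(p, map_len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical helper: read the start of the file back with read(2)
 * and NUL-terminate it. Returns 0 on success, -1 on error.
 */
int load_record(const char *path, char *buf, size_t buf_len)
{
    assert(path != NULL && buf != NULL);

    if (buf_len == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t got = read(fd, buf, buf_len - 1);
    close(fd);
    if (got < 0)
        return -1;
    buf[got] = '\0';
    return 0;
}
```

A caller would do something like store_record("/some/file", "hello",
4096) and later read it back with load_record(). Nothing in the
application has to know what kind of storage is underneath.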

> If you use persistent memory for the page cache, then you can use it
> with any filesystem of any arbitrary size.

We don't actually need (or, IMO, want) the page
cache to be aware of persistent memory state. If the page
cache is persistent, then we need to store that persistent state
somewhere so that when the machine crashes and reboots, we can bring
the persistent page cache back up. That involves metadata to hold
state, crash recovery, etc. We've already got all that persistence
management in our filesystem implementations.

IOWs, persistent data and its state belong in the filesystem
domain, not the page cache domain.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .