Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

On Mon, Feb 18, 2019 at 2:50 AM Johannes Thumshirn <jthumshirn@xxxxxxx> wrote:
>
> On 16/02/2019 06:39, Dave Chinner wrote:
> [..]
>
> >> We've supported this since mid 2018 and commit ba23cba9b3bd ("fs:
> >> allow per-device dax status checking for filesystems"). That is,
> >> we can have DAX on the XFS RT device independently of the data device.
> >>
> >> That is, you set up pmem in three segments - two small identical
> >> segments that get mirrored with RAID1 as the data device, and
> >> the remainder as a DAX-capable block device set up as the
> >> XFS realtime device. Set the RTINHERIT bit on the root directory at
> >> mkfs time ("-d rtinherit=1") and then all the data goes to the DAX
> >> capable realtime device, and all the metadata goes to the software
> >> raided pmem block devices that aren't DAX capable.
> >>
> >> Problem already solved, yes?
> >
> > Sorry, this was meant to be a reply to Dan's email commenting about
> > some people needing mirrored metadata, not the parent that was
> > talking about whole device RAID...
> >
> > i.e. mirrored metadata w/ FS-DAX for data should already be a solved
> > problem...
>
> Trying to answer you both.
>
> But deferring the data redundancy to the application sounds like a no-go
> to me, sorry. We don't do that for "traditional" block storage (SCSI,
> NVMe, you name it). Some applications might already be able to handle it
> but definitely not all. I don't see a typical DBMS like MariaDB or
> Postgres doing its own data duplication over interleave sets of NV-DIMMs.

Oh, definitely agreed. I was just saying that for the subset of
applications that *do* perform application-level redundancy, the lack
of metadata redundancy is a liability.
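
For anyone who wants to try the layout Dave describes above, a rough
sketch (device names are illustrative, and it assumes the three pmem
segments already exist as namespaces):

  # mirror the two small segments; the md device becomes the XFS data
  # device, i.e. it holds all the metadata once rtinherit is set
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/pmem0 /dev/pmem1

  # file data goes to the DAX-capable realtime device, metadata to the mirror
  mkfs.xfs -d rtinherit=1 -r rtdev=/dev/pmem2 /dev/md0
  mount -o rtdev=/dev/pmem2,dax /dev/md0 /mnt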

> And if you carve out a bit of your pmem space into its own namespace for
> the metadata (did I understand you right here?) you still have the
> problem that all data written to the DIMMs is interleaved in an
> interleave set, if I understand it correctly.
>
> So if one DIMM in your interleave set goes bad, you're lost anyway.

Yes, if you want to be able to survive the loss of a single DIMM, then
you need to disable interleaving and RAID across the DIMMs. However,
once you do that, DAX for data can't work by definition (the software
RAID layer sits between the application and the media), but RAID for
metadata would work.
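
For completeness, a rough sketch of that non-interleaved layout
(illustrative names, and it assumes the platform exposes one pmem
region per DIMM, i.e. interleaving disabled in the BIOS/firmware):

  # one fsdax namespace per DIMM-backed region
  ndctl create-namespace --region=region0 --mode=fsdax
  ndctl create-namespace --region=region1 --mode=fsdax

  # RAID1 across the DIMMs; the md device is no longer DAX-capable,
  # so it can only host metadata (or non-DAX data)
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/pmem0 /dev/pmem1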


