On 16/02/2019 06:39, Dave Chinner wrote:
[..]
>> We've supported this since mid 2018 and commit ba23cba9b3bd ("fs:
>> allow per-device dax status checking for filesystems"). That is,
>> we can have DAX on the XFS RT device independently of the data device.
>>
>> That is, you set up pmem in three segments - two small identical
>> segments that get mirrored with RAID1 as the data device, and
>> the remainder as a block device that is dax capable set up as the
>> XFS realtime device. Set the RTINHERIT bit on the root directory at
>> mkfs time ("-d rtinherit=1") and then all the data goes to the DAX
>> capable realtime device, and all the metadata goes to the software
>> raided pmem block devices that aren't DAX capable.
>>
>> Problem already solved, yes?
>
> Sorry, this was meant to be a reply to Dan's email commenting about
> some people needing mirrored metadata, not the parent that was
> talking about whole device RAID...
>
> i.e. mirrored metadata w/ FS-DAX for data should already be a solved
> problem...

Trying to answer you both.

But deferring the data redundancy to the application sounds like a
no-go to me, sorry. We don't do that for "traditional" block storage
(SCSI, NVMe, you name it). Some applications might already be able to
handle it, but definitely not all. I don't see your random DBMS like
MariaDB or Postgres already doing data duplication over interleave
sets of NV-DIMMs.

And if you carve out a bit of your pmem space into its own namespace
for the metadata (did I understand you right here?), you still have
the problem that all data written to the DIMMs is interleaved across
an interleave set, if I understand it correctly. So if one DIMM in
your interleave set goes bad, you're lost anyway.

Byte,
	Johannes
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@xxxxxxx                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850