The idea is to be able to use more than two disks, like raid 5 or raid 6, except with the parity kept on its own disks instead of distributed across the disks, and the data kept on its own disks as well. I've used SnapRaid a bit and was making some changes to my own setup when I wondered why something similar couldn't be done at the block device level, while keeping one of the advantages of SnapRaid-like systems: if any data disk is lost beyond recovery, only the data on that disk is lost, because each of the other data disks still carries its own complete filesystem. Unlike SnapRaid, the parity data would be updated in real time.

For instance:

/dev/sda - data disk 1, say 1TB
/dev/sdb - data disk 2, 2TB
/dev/sdc - data disk 3, 2TB
/dev/sdd - parity disk 1 (for a raid-5-like setup), 2TB
/dev/sde - parity disk 2 (for a raid-6-like setup), 2TB

The parity disks must be at least as large as the largest data disk. If a given block is not present on a data disk (because that disk is smaller than the others), it is treated as all zeroes, so the parity for position 1.5TB would use zeroes for /dev/sda and whatever the blocks are on /dev/sdb and /dev/sdc.

Normal raid 5/6 would expose only a single logical block device, /dev/md0, and the data and parity would be distributed across the disks. If any data disk is lost without enough parity, it's not possible to recover any data, since the blocks are scattered across all of the disks. What good is a file if it is missing every third block? Even identifying the files would be virtually impossible, since the filesystem structures and everything else on /dev/md0 are distributed as well.

My idea is basically that, instead of exposing a single logical block device for the joined array, each data disk would be exposed as its own logical block device: /dev/sda1 might be exposed as /dev/fr1 (well, some better name), /dev/sdb1 as /dev/fr2, and /dev/sdc1 as /dev/fr3; the parity disks would not be exposed as logical block devices at all. The blocks would essentially be a 1-1 identity mapping between /dev/fr1 and /dev/sda1 and so on, except for a small header, so block 0 on /dev/fr1 might actually be block 8 on /dev/sda1. If any single disk were ever removed from the array, the data on it could still be accessed in full via losetup with an offset, and any filesystem built on it could be read independently of the other data disks.

The difference from traditional raid is that if every disk somehow got damaged beyond recovery except /dev/sda, it would still be possible to recover whatever data was on that disk, since it was exposed to the system as its own block device with an entire filesystem on it. The same goes for /dev/sdb and /dev/sdc. Any write to one of the data block devices would automatically update the parity as well. Any read from a failed data block device would recompute the data from the available parity in real time, with degraded performance. The filesystems created on the exposed logical block devices could be used however the user sees fit, whether related (such as a union/merge pool filesystem) or unrelated (such as /home on the /dev/fr1 filesystem and /usr/local on /dev/fr2). There would be no read/write performance increase, since each read from a single logical block device maps to the same single physical device. A rough sketch of the block and parity handling follows below.
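Just to make the block math concrete, here is a rough sketch in Python of what a write to one of the exposed devices and a degraded read could look like with a single XOR parity disk (the raid-5-like case). None of this is an existing driver or API; the names (write_data_block, reconstruct_block, HEADER_BLOCKS, and so on) are made up purely for illustration, a real implementation would live in the kernel or a device-mapper target, and the raid-6-like second parity would need Reed-Solomon style math rather than plain XOR.

#!/usr/bin/env python3
# Toy model of the proposed per-disk parity scheme (raid-5-like, single
# XOR parity).  Hypothetical names only; "disks" are just seekable files.

BLOCK_SIZE = 4096      # bytes per block
HEADER_BLOCKS = 8      # small per-disk header: /dev/fr1 block 0 = /dev/sda1 block 8
ZERO = bytes(BLOCK_SIZE)

def read_block(disk, blk):
    """Read one block, or all zeroes if the disk is too small to hold that
    block (the 'missing blocks count as zero' rule for smaller data disks)."""
    disk.seek(0, 2)                              # seek to end to find the size
    if (blk + 1) * BLOCK_SIZE > disk.tell():
        return ZERO
    disk.seek(blk * BLOCK_SIZE)
    return disk.read(BLOCK_SIZE)

def write_block(disk, blk, data):
    disk.seek(blk * BLOCK_SIZE)
    disk.write(data)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def write_data_block(data_disk, parity_disk, logical_blk, new_data):
    """Write one logical block on an exposed device (/dev/frN) and keep the
    parity disk in sync, read-modify-write style:
        new_parity = old_parity XOR old_data XOR new_data
    so the other data disks never need to be read."""
    phys_blk = logical_blk + HEADER_BLOCKS       # 1-1 mapping plus header offset
    old_data = read_block(data_disk, phys_blk)
    old_parity = read_block(parity_disk, logical_blk)
    write_block(data_disk, phys_blk, new_data)
    write_block(parity_disk, logical_blk, xor(old_parity, xor(old_data, new_data)))

def reconstruct_block(surviving_data_disks, parity_disk, logical_blk):
    """Degraded read: rebuild a block of the failed disk by XOR-ing the
    parity block with the same position on every surviving data disk."""
    result = read_block(parity_disk, logical_blk)
    for disk in surviving_data_disks:
        result = xor(result, read_block(disk, logical_blk + HEADER_BLOCKS))
    return result

if __name__ == "__main__":
    import io
    # Toy run: two data "disks" and one parity "disk" backed by memory.
    d1, d2, par = io.BytesIO(), io.BytesIO(), io.BytesIO()
    write_data_block(d1, par, 0, b'A' * BLOCK_SIZE)
    write_data_block(d2, par, 0, b'B' * BLOCK_SIZE)
    # Pretend d1 died: rebuild its logical block 0 from d2 plus parity.
    assert reconstruct_block([d2], par, 0) == b'A' * BLOCK_SIZE

The read-modify-write in write_data_block is also why writes see no speedup: every small write costs a read and a write on both the data disk and the parity disk, just like a raid-5 small write. The HEADER_BLOCKS offset is what a removed disk would need skipped (losetup -o with the header size, for example) to mount its filesystem directly.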
There would still be the typical redundancy of raid, and if another disk fails during a recovery/rebuild and prevents full recovery, only the data on the lost data disks is gone.

Thanks,

Brian Vanderburg II

On 8/28/20 11:31 AM, antlists wrote:
> On 24/08/2020 18:23, Brian Allen Vanderburg II wrote:
>> Just an idea I wanted to put out there to see if there were any
>> merit/interest in it.
>
> I hate to say it, but your data/parity pair sounds exactly like a
> two-disk raid-1 mirror. Yes, parity != mirror, but in practice I think
> it's a distinction without a difference.
>
> Cheers,
> Wol