On Wed, 29 Oct 2014 12:45:34 +0530 Anshuman Aggarwal <anshuman.aggarwal@xxxxxxxxx> wrote:

> I'm outlining below a proposal for a RAID device-mapper virtual block
> device for the kernel which adds "split RAID" functionality on an
> incremental batch basis, for a home media server with archived content
> that is rarely accessed.
>
> Given a set of N+X block devices (nominally of the same size; the
> smallest common size wins), the SplitRAID device-mapper target
> generates virtual devices which are passthrough for the N devices and
> write a batched/delayed checksum to the X devices, so as to allow
> offline recovery of blocks on the N devices in case of a single disk
> failure.
>
> Advantages over conventional RAID:
>
> - Disks can be spun down, reducing wear and tear compared to MD RAID
>   levels (such as 1, 10, 5, 6) in the case of rarely accessed
>   archival content.
>
> - Catastrophic data loss from multiple device failures is prevented:
>   each block device is independent, so unlike MD RAID, data is only
>   lost incrementally.
>
> - Performance degradation for writes can be avoided by keeping the
>   checksum update asynchronous and delaying the fsync to the checksum
>   block device.
>
> In the event of an improper shutdown the checksum may not reflect all
> updates, but it will be mostly up to date, which is often acceptable
> for home media server requirements. A flag can be set when the
> checksum block device was shut down properly, indicating that a full
> checksum rebuild is not required.
> Existing solutions considered:
>
> - SnapRAID (http://snapraid.sourceforge.net/), which is a
>   snapshot-based scheme. Its advantages are that it runs in user
>   space and has cross-platform support, but it has the huge
>   disadvantage that every checksum is computed from scratch, slowing
>   the system, causing immense wear and tear on every snapshot, and
>   losing any updates made since the last snapshot point.
>
> I'd like to get opinions on the pros and cons of this proposal from
> more experienced people on the list, to be redirected suitably on the
> following questions:
>
> - Can this already be done using the block devices available in the
>   kernel?
>
> - If not, is device mapper the right API to use? (I think so.)
>
> - What would be the best block device code to look at as a starting
>   point for an implementation?
>
> Neil, would appreciate your weighing in on this.

Just to be sure I understand: you would have N + X devices. Each of
the N devices contains an independent filesystem and could be accessed
directly if needed. Each of the X devices contains some codes so that
if at most X devices in total died, you would still be able to recover
all of the data. If more than X devices failed, you would still get
complete data from the working devices.

Every update would only write to the particular N device on which it
is relevant, and to all of the X devices. So N needs to be quite a bit
bigger than X for the spin-down to be really worth it.

Am I right so far?

For some reason the writes to X are delayed... I don't really
understand that part.

Sounds like multi-parity RAID6 with no parity rotation and
chunksize == devicesize.

I wouldn't use device-mapper myself, but you are unlikely to get an
entirely impartial opinion from me on that topic.

NeilBrown
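The single-parity (X = 1) case of the scheme discussed above can be sketched in a few lines of Python, with byte strings standing in for block devices. This is an illustrative model only, not any proposed kernel API; the helper name `xor_blocks` and the sample data are invented for the example.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# N = 3 independent "devices", each holding its own filesystem data.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The single X device holds the XOR of all N devices
# (the batched/delayed checksum).
parity = xor_blocks(*data)

# Incremental update: when device 1 changes, the parity can be fixed
# up as parity ^ old_data ^ new_data, without reading the other
# devices -- they can stay spun down.
new_block = b"DDDD"
parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block

# Offline recovery: if device 2 dies, XORing the parity with the
# surviving devices reconstructs its contents, while devices 0 and 1
# remain directly readable the whole time.
recovered = xor_blocks(parity, data[0], data[1])
assert recovered == b"CCCC"
```

This also makes Neil's "chunksize == devicesize" observation concrete: each data device contributes exactly one (device-sized) chunk to a single non-rotating parity stripe.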