Could we implement a more flexible raid1? Maybe with checksums?
A wrong checksum would mean the page failed or has errors.

The page size has to be chosen well for high performance. For example:
  mirror 1 = 4096 page size
  mirror 2 = 8192 page size
  mirror 3 = 512 page size
A good value for the raid page size is 8192 (a multiple of both 4096 and 512).
The page size should be a multiple of the checksum size. For example, with
1 byte of checksum for each 512 bytes and a page of 8192 bytes, a single
page holds 8192 checksums.

What's the idea of the `new` raid1 with checksum?
Considering an 8192 page size with 3 mirrors:
  * errors are detected per page, not per mirror
  * pages keep the filesystem fast (ok, a little slower than without raid)
  * low disk use for checksums

What we need, for example with a raid of 8192001 bytes:
  page size = 8192  <- given at mdadm --create
  checksum size per page = crc32? 4 bytes  <- given at mdadm --create ???
  total pages = floor(size / page size) = floor(8,192,001 / 8192) = 1000
    (~1000.000122, we lose 1 byte)
  checksums per page = floor(page size / checksum size) = floor(8192 / 4) = 2048
  total checksum pages = ceil(total pages / checksums per page)
    = ceil(1000 / 2048) = 1
    (1000 / 2048 = 0.48828125, so a lot of checksum slots go unused)
  total data pages = total pages - checksum pages = 1000 - 1 = 999
  total size for filesystem = total data pages * page size
    = 999 * 8192 = 8,183,808 bytes

Should we use more information? What about which drive is the newest?
For example, if we remove disk 1 while disks 2 and 3 stay online, writes to
2 and 3 make disk 1 older. Should we use per-disk last write time
information? Maybe a page just for this information? This could help us
check which disk is currently up to date. Should the checksum be stored
together with this value, for example 4096 bytes + this page value? Or one
page for checksums and one page for the last write time value?
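The page arithmetic above can be checked with a short Python sketch. This is only an illustration of the proposed layout, not anything md or mdadm does today; the names `Layout` and `plan_layout` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    total_pages: int
    checksums_per_page: int
    checksum_pages: int
    data_pages: int
    usable_bytes: int
    lost_bytes: int


def plan_layout(device_bytes: int, page_size: int = 8192,
                checksum_size: int = 4) -> Layout:
    """Split a device into data pages and checksum pages (hypothetical layout)."""
    if page_size % checksum_size != 0:
        raise ValueError("page size must be a multiple of the checksum size")
    total_pages = device_bytes // page_size              # floor(size / page size)
    checksums_per_page = page_size // checksum_size      # floor(page size / checksum size)
    # Integer ceil(total pages / checksums per page); slightly conservative,
    # because the checksum pages themselves are counted as needing a slot.
    checksum_pages = -(-total_pages // checksums_per_page)
    data_pages = total_pages - checksum_pages
    return Layout(
        total_pages=total_pages,
        checksums_per_page=checksums_per_page,
        checksum_pages=checksum_pages,
        data_pages=data_pages,
        usable_bytes=data_pages * page_size,
        lost_bytes=device_bytes - total_pages * page_size,
    )


layout = plan_layout(8192001, page_size=8192, checksum_size=4)
print(layout)
```

With a 4-byte checksum the overhead stays below 0.1% of the pages once the array is large enough to fill the checksum pages.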
The idea is to help know which value is the newest; a startup page could
allow us to sync pages on each disk.

Ideas:
* "it does not do 'voting' on RAID1 with more than 2 devices":
  this could be done with a per-page last write time (raid 5 or raid6?)
* "obviously it does not have per-block checksums anywhere":
  a per-block checksum (raid 5 or raid6?)
Got it? Any ideas?

For example, imagine that we have ten 1TB disks and we want a 1TB 'raid'
disk. The best option today is RAID1: a mirror on every disk, and a very
fast read speed if we can select the right read algorithm, for example
closest head position, fastest read time, round robin, or page modulo
mirrors in the raid. With 10 disks, a read of page 1 goes to disk 1, a read
of page 12 goes to disk 2, and reads of pages 23, 3, 13 and 43 go to disk 3:
'page number' mod 'mirrors on raid' = disk to read.

A quick summary; reading about openbsd we could get:
  write algorithm (which disk should be written? raid 0 with stripes, for example)
  read algorithm (which disk should be read? raid1 with good disks could
    read with closest head position, fastest read time, round robin, etc...)
  stripe algorithm (raid0, raid0 with stripes)
  mirror algorithm (raid1)
  checksum algorithm (none = raid1, crc disk ~ raid 5/6,
    crc page per mirror = raid1 with checksum)
  correction algorithm (?? any ideas)
  sync algorithm (per page / per disk ??)
  start disk algorithm (per page? per disk? last write time?
    incremental write number?)
  checksum/correction location (on each disk: more secure; or on an
    external disk / file: less secure)

A mdadm with all these options could make a very flexible raid solution...
I don't believe that we could have anything more flexible than this. Any
ideas?? A lot of this work is already done today and just needs to be
remapped, but there are more things to do... Does anyone want a new
project? md2? Like v4l2?
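To make the read and correction ideas a bit more concrete, here is a minimal Python sketch. It assumes each mirror returns its copy of a page as `bytes` and that a crc32 may be stored per page; `disk_for_page`, `pick_by_checksum` and `pick_by_vote` are hypothetical names, and none of this reflects how md behaves today.

```python
import zlib
from collections import Counter


def disk_for_page(page_number: int, mirrors: int) -> int:
    """'page number' mod 'mirrors on raid' = disk to read."""
    return page_number % mirrors


def pick_by_checksum(copies: list[bytes], stored_crc: int) -> tuple[bytes, list[int]]:
    """Return a copy whose crc32 matches, plus the indexes of bad mirrors."""
    good = [i for i, data in enumerate(copies) if zlib.crc32(data) == stored_crc]
    if not good:
        raise IOError("no mirror matches the stored checksum")
    bad = [i for i in range(len(copies)) if i not in good]
    return copies[good[0]], bad


def pick_by_vote(copies: list[bytes]) -> tuple[bytes, list[int]]:
    """Majority vote across mirrors; only meaningful with more than 2 mirrors."""
    winner, votes = Counter(copies).most_common(1)[0]
    if votes * 2 <= len(copies):
        raise IOError("no majority among mirrors")
    bad = [i for i, data in enumerate(copies) if data != winner]
    return winner, bad


# Three mirrors, the third one returns corrupted data for this page.
copies = [b"good page", b"good page", b"bad  page"]
print(disk_for_page(23, 10))                              # disk chosen for page 23
print(pick_by_vote(copies))                               # majority wins, mirror 2 flagged
print(pick_by_checksum(copies, zlib.crc32(b"good page")))  # checksum flags mirror 2
```

Voting needs every mirror to be read, so it gives up the read-speed advantage; the per-page checksum lets a single read be verified and the other mirrors touched only on a mismatch.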

2011/1/5 Roman Mamedov <rm@xxxxxxxxxx>:
> On Wed, 5 Jan 2011 18:03:47 -0600
> "Leslie Rhorer" <lrhorer@xxxxxxxxxxx> wrote:
>
>> RAID1 certainly offers the most robust solution, especially
>> with more than 1 mirror.
>
>> RAID1 is as safe as it gets
>
> Are you sure about that? Considering that mdadm's handling of corrupt data on
> RAID1 devices is pretty simplistic (obviously it does not have per-block
> checksums anywhere, it does not do 'voting' on RAID1 with more than 2
> devices), it basically has no way of knowing if a block of data is returned
> differently by some of the component devices, which one has the 'correct'
> data. From what I understand, RAID5 and especially RAID6 give a much better
> protection in this situation.
>
> --
> With respect,
> Roman
>

--
Roberto Spadim
Spadim Technology / SPAEmpresarial