-----Original Message-----
From: David Greaves [mailto:david@xxxxxxxxxxxx]
Sent: Sunday, May 18, 2008 4:12 AM
To: Janos Haar
Cc: linux-raid@xxxxxxxxxxxxxxx; David Lethe
Subject: Re: questions about softraid limitations

Janos Haar wrote:
> At this time, I am working in my data recovery company, and sometimes need
> to recover broken hw raid arrays too.

Ah - I missed this too.

> (with md arrays, we have no problem at all. :-) )

Nice quote for "the benefits of software raid" somewhere :)

> In your rows, we are talking about 2 cases:
>
> a, disk hw problem (only bad sectors; the completely failed disk is
> case 'b')
> Yes, ddrescue is the best way to do the recovery, but:
> ddrescue is too aggressive with the default -e 0 setting!
> This can easily make the drive fail! (depending on the reason for the bad
> sectors)

OK, worth knowing - what would you suggest?

> And with the images, we have another problem!
> The 0x00 holes.
> Neither the hw nor md has any idea where we need to recover from parity and
> where we have real zero blocks....
> Overall this is why data recovery companies are learning and developing more
> and more.... :-)

Hmm - I wonder if things like ddrescue could work with the md bitmaps to
improve this situation?
Is this related to David Lethe's recent request?

> I need no help at this time, I just want to share my ideas, to help
> upgrade/develop md, and to help people....

OK - ta.

David

-----------

No, we are trying two different approaches. In my situation, I already
know that the data is munged on a particular block, so the solution is to
calculate the correct data from surviving parity, and just write the new
value. There is no reason to worry about md bitmaps, or even whether or
not there are 0x00 holes. I am not trying to fix a problem such as a
rebuild gone bad or an intermittent disk failure that put the md array in
a partially synced and totally confused state.
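
To illustrate "calculate the correct data from surviving parity": in a
single-parity (RAID5-style) stripe, the parity block is the XOR of the data
blocks, so any one block equals the XOR of all the others. The sketch below
works on in-memory byte strings only. The function names are made up for
illustration and are not part of md or any other tool.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of ints per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving_blocks):
    """Recompute the one missing or bad block of a single-parity stripe.

    surviving_blocks must hold every other block of the stripe,
    data and parity alike; their order does not matter.
    """
    return xor_blocks(surviving_blocks)


# Tiny worked example: three data blocks plus their parity.
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
d2 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([d0, d1, d2])

# Pretend d1 is the munged block and rebuild it from the rest.
rebuilt_d1 = rebuild_block([d0, d2, parity])
```

Because XOR is symmetric, the same call rebuilds a bad parity block just as
well as a bad data block. This only holds when exactly one block in the
stripe is bad.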
[I also do data recovery, but I only take on jobs relating to certain
hardware RAID controllers where I am intimately familiar with the
metadata layout ... and have a software bag-o-tricks that I nearly always
have to modify given the original configuration and chain of events.]

My desire is to limit damage before a full disk recovery needs to be
performed, by ensuring that there are no double errors that will make
stripe-level recovery impossible (assuming they aren't using RAID6). For
that I need a mechanism to repair a stripe given a physical disk and
offset. There is no completely failed disk to contend with, merely a
block of bad data that will repair itself once I issue a simple write
command. (The trick, of course, is to figure out exactly what and where
to write it, and to deal with potential locking issues relating to the
file system.)
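
A rough sketch of what "repair a stripe given a physical disk and offset"
could look like when run against disk images rather than a live array. It
rests on several assumptions that are not from this thread: single parity
(RAID5-style), every member storing its data at the same starting offset so
that the same byte offset on each member belongs to the same stripe, exactly
one bad member at that offset, and nothing else writing to the members while
it runs (it does no locking at all). The names read_at and repair_block are
hypothetical helpers, not md interfaces.

```python
import os
from functools import reduce


def read_at(path, offset, length):
    """Read exactly `length` bytes at byte `offset` from a member image."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    if len(data) != length:
        raise IOError("short read from %s at offset %d" % (path, offset))
    return data


def repair_block(members, bad_index, offset, length=512, write=False):
    """Recompute one block on members[bad_index] from all other members.

    Assumes the same `offset` on every member lies in the same stripe.
    Returns the recomputed bytes; only writes them back if write=True.
    """
    if len(members) < 3:
        raise ValueError("single-parity repair needs at least 3 members")
    # Read the same range from every member except the bad one.
    peers = [read_at(path, offset, length)
             for i, path in enumerate(members) if i != bad_index]
    # XOR of all surviving blocks (data and parity) gives the missing one.
    fixed = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*peers))
    if write:
        with open(members[bad_index], "r+b") as f:
            f.seek(offset)
            f.write(fixed)
            f.flush()
            os.fsync(f.fileno())  # push the rewritten block to the medium
    return fixed
```

A call such as repair_block(["m0.img", "m1.img", "m2.img"], 1, 4096) (the
paths are placeholders) returns the recomputed bytes without touching
anything, which leaves room to inspect the result before passing write=True.
It does nothing about the locking question raised above.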