Re: Why not just return an error?

On 07/10/16 19:52, Phil Turmel wrote:

MD raid has no idea what is at any given sector.  And with a
near-infinite variety of layering choices, there's no way it's going to.
That's why *you* have to do this.  You trimmed my description of the
only "easy option" that is actually trustworthy.

I actually wanted to ask about that. Can you really ddrescue a drive
with a "hole" in it, re-add it and expect it to work? What happens if
you try to read from that "hole" again? And while I'm talking about
re-adding, when does it become impossible to "re-add" a drive?

Yes, ddrescue replaces unreadable areas with zeroes.  If those blocks
were part of a file, then the file will have zeroes in it.  But they
might have been where an inode or dirent was stored, in which case you
get orphaned data elsewhere.  You need fsck to minimize that.

Ah, yes - in this case it's the only drive with this piece of information, and md doesn't keep any checksums or anything, so it will simply return those zeroes. Thanks for explaining this!


ddrescue can provide a listing of the sectors it replaced so you can use
filesystem forensic tools to pinpoint the problems (which file, etc).
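
As a rough illustration of that listing: ddrescue records its progress in a
mapfile, and blocks it gave up on are marked with a "-" status.  The sketch
below pulls those blocks out as sector ranges.  It assumes the usual mapfile
layout (comment lines starting with "#", one status line, then
"pos size status" lines with hex values) and 512-byte sectors; the function
name, the sample text and the sector size are illustrative only, and the
resulting sectors are relative to the device ddrescue read, not to the array
or the filesystem on top of it.

```python
# Sketch only: assumes the common ddrescue mapfile layout and 512-byte sectors.
SAMPLE = """\
# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x00100000     +               1
#      pos        size  status
0x00000000  0x00010000  +
0x00010000  0x00000400  -
0x00010400  0x000EFC00  +
"""


def bad_sector_ranges(mapfile_text, sector_size=512):
    """Return (first_sector, last_sector) for every block marked '-' (bad)."""
    ranges = []
    status_line_seen = False
    for line in mapfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not status_line_seen:
            # The first non-comment line is the status line, not a data block.
            status_line_seen = True
            continue
        fields = line.split()
        pos, size, status = int(fields[0], 0), int(fields[1], 0), fields[2]
        if status == "-":
            first = pos // sector_size
            last = (pos + size - 1) // sector_size
            ranges.append((first, last))
    return ranges


if __name__ == "__main__":
    for first, last in bad_sector_ranges(SAMPLE):
        print(f"zero-filled sectors {first}..{last}")
```

The ranges still have to be translated by hand (member data offset, then the
filesystem's own block mapping) before a forensic tool can name the affected
file.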

Note that all of the above are manual operations -- mdadm has no
knowledge of the upper layers.

None of the above uses --re-add.  Just assembly or forced assembly.
Re-add is only to return a kicked drive to a *functional* array when the
failure reason isn't really the drive.  (Controller, cable, power
supply, etc.)  And re-add is only helpful if the array members have
write-intent bitmaps so MD can figure out which parts of the re-added
disk are out of date.  Re-add can be used if a drive is kicked for
timeout mismatch, but is only helpful if the mismatch is addressed first.
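
To make the role of the bitmap concrete, here is a toy model of a two-way
mirror.  It is not how MD implements its bitmap; the class and method names
are made up for illustration.  It only shows the idea that chunks written
while a member is missing are remembered, so a later re-add copies just those
chunks instead of the whole disk.

```python
# Toy model only: names and structure are hypothetical, not MD internals.
class ToyMirror:
    def __init__(self, n_chunks):
        self.disks = {"a": [0] * n_chunks, "b": [0] * n_chunks}
        self.active = {"a", "b"}
        self.dirty = set()  # chunks written while a member was missing

    def write(self, chunk, value):
        for name in self.active:
            self.disks[name][chunk] = value
        if len(self.active) < len(self.disks):
            # Degraded: remember which chunk the missing member lacks.
            self.dirty.add(chunk)

    def kick(self, name):
        self.active.discard(name)

    def re_add(self, name):
        """Copy only the dirty chunks to the returning member."""
        source = next(iter(self.active))
        copied = sorted(self.dirty)
        for chunk in copied:
            self.disks[name][chunk] = self.disks[source][chunk]
        self.active.add(name)
        self.dirty.clear()
        return copied


if __name__ == "__main__":
    m = ToyMirror(8)
    m.write(1, 11)          # both members up: nothing to remember
    m.kick("b")             # e.g. a cable problem, not a bad drive
    m.write(3, 33)
    m.write(5, 55)
    print("chunks resynced on re-add:", m.re_add("b"))
```

Without the dirty set there is no way to tell which chunks the returning
member missed, which is why a re-add without a bitmap gains nothing over a
full resync.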

"Forced assembly"... That's the one thing I had missed. So force-assembling a faulty drive back into a collapsed array after each failure would basically do what I wanted, and with no inconsistencies, because the array stops the moment the drive is kicked; but I can see why this is not a good idea. %)

So, "re-adding" is only possible with a functional array, and only useful when a write-intent bitmap is in place. But I clearly remember that not long ago one of my drives failed (most likely due to a cable popping off) and refused to re-add into a mirror with a bitmap, so I'm still wondering why that was not possible. At least in theory, as long as there is a bitmap, it should be possible to re-add, no matter how much later, right?


--
darkpenguin