On 2005-08-18T15:28:41, Neil Brown <neilb@xxxxxxxxxxxxxxx> wrote:

> If we want to mirror a single drive in a raid5 array, I would really
> like to do that using the raid1 personality.
> e.g.
>    suspend io
>    remove the drive
>    build a raid1 (with no superblock) using the drive.
>    add that back into the array
>    resume io.

I hate to say this, but this is something the Device Mapper framework,
with its suspend/resume options and the ability to change the mapping
atomically, already does well. Maybe copying some of the ideas would
be useful.

Freeze, reconfigure one disk to be RAID1, resume - all IO goes on while
at the same time said RAID1 re-mirrors to the new disk. Repeat with a
removal later.

> To handle read failures, I would like the first step to be to re-write
> the failed block. I believe most (all?) drives will relocate the
> block if a write cannot succeed at the normal location, so this will
> often fix the problem.

Yes. This would be highly useful.

> A userspace process can then notice an unacceptable failure rate and
> start a mirror/swap process as above.

Agreed. Combined with SMART monitoring, this could provide highly
useful features.

> This possibly doesn't handle the possibility of a write failing very
> well, but I'm not sure what your approach does in that case. Could
> you explain that?

I think a failed write can't really be handled - it might be retried
once or twice, but then the way to proceed is to kick the drive and
rebuild the array.

> It also means that if the raid1 rebuild hits a read-error it cannot
> cope whereas your code would just reconstruct the block from the rest
> of the raid5.

Good point. One way to fix this would be to have a callback to one
level up: "Hi, I can't read this section, can you reconstruct and give
it to me?" (Which is a pretty ugly hack.)

However, that would also assume that the data on the disk which _can_
be read can still be trusted. I'm not sure I'd buy that myself without
some verification.
But a periodic background consistency check for RAID might help
convince users that this is indeed the case ;-)

If you can no longer pro-actively reconstruct the disk because it has
indeed failed, maybe treating it like a failed disk and rebuilding the
array in the "classic" fashion isn't the worst idea, though.

Sincerely,
    Lars Marowsky-Brée <lmb@xxxxxxx>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

"Ignorance more frequently begets confidence than does knowledge"
    -- Charles Darwin
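
To make the read-error policy discussed above a bit more concrete, here
is a toy sketch in Python. It is not md code. ToyDisk, ToyStripeSet,
max_errors and needs_replacement are hypothetical names made up for
this illustration, and it assumes (as the thread does) that re-writing
a block lets the drive relocate it. On a read error the block is
reconstructed by XOR from the other members, written back to the
failing member, and counted, so that a userspace monitor could decide
when to start the mirror/swap.

```python
from functools import reduce


class ReadError(Exception):
    """Raised by ToyDisk.read for a block marked unreadable."""


class ToyDisk:
    """In-memory stand-in for one member drive (hypothetical)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.bad = set()  # block numbers that currently fail to read

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)
        return self.blocks[n]

    def write(self, n, data):
        # Assumption from the thread: a write lets the drive relocate
        # the sector, so the block becomes readable again.
        self.blocks[n] = data
        self.bad.discard(n)


def xor_blocks(blocks):
    """XOR equal-length byte strings together (raid5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class ToyStripeSet:
    """Data disks plus one parity disk; every member holds block n of stripe n."""

    def __init__(self, disks, max_errors=2):
        self.disks = disks
        self.max_errors = max_errors  # illustrative threshold only
        self.read_errors = [0] * len(disks)

    def read(self, disk_idx, n):
        try:
            return self.disks[disk_idx].read(n)
        except ReadError:
            self.read_errors[disk_idx] += 1
            # Reconstruct from all other members. A second failure in
            # the same stripe propagates as ReadError: not recoverable.
            others = [d.read(n) for i, d in enumerate(self.disks)
                      if i != disk_idx]
            data = xor_blocks(others)
            # First step after a read failure: re-write the block.
            self.disks[disk_idx].write(n, data)
            return data

    def needs_replacement(self, disk_idx):
        """True once a member has exceeded the tolerated error count."""
        return self.read_errors[disk_idx] > self.max_errors
```

A monitor would poll needs_replacement() (ideally alongside SMART data)
and kick off the suspend / raid1 / resume sequence for that member. A
failed write is deliberately not modelled here. As said above, that
case ends with kicking the drive and a classic rebuild.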