Maybe you missed my post from yesterday.
http://marc.theaimsgroup.com/?l=linux-raid&m=110559211400459&w=2

The reason for using no superblock was to avoid overwriting data on the
failing component of the top RAID5 array. If you build the top array
from degraded RAID1 arrays, then do use a superblock for the RAID1
arrays. Also, so that the RAID1 arrays don't all appear degraded,
configure them with only 1 device. Grow them to 2 devices when needed,
then shrink them back to 1 when done.

The RAID1 idea will not work, since a bad block will take out the RAID1.
But there are more issues, see the above URL.

Guy

-----Original Message-----
From: linux-raid-owner@xxxxxxxxxxxxxxx
[mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Peter T. Breuer
Sent: Thursday, January 13, 2005 12:17 PM
To: linux-raid@xxxxxxxxxxxxxxx
Subject: Re: Spares and partitioning huge disks

Guy <bugzilla@xxxxxxxxxxxxxxxx> wrote:
> Peter said:
> "Well, I don't see where there's any window in which it's degraded."
>
> These are the steps that cause the window (see "Original Message" for full
> details):
>
> 1. fail out the chosen drive. (array is now degraded)

I would suggest "don't do that then". Start with an array of degraded
RAID1s, as I suggested, and add in an extra disk to one of the RAID1s,
wait till it syncs, then remove the original component. Instant new
(degraded) RAID1 in the place of the old, and the array above none the
wiser.

> 2. combine it with the spare in a raid1 with no superblock (resync starts)

Why "no superblock"? Oh well - let's leave it as a mystery.

> 3. add this raid1 back into the main array. (The main array is now in-sync
> other than any changes that occurred since you failed the disk in step 1)

Well, if you have an array of arrays it seems that the main array must
have been degraded too, but I don't see where you took the subarray out
of it in the sequence above (in order to add it back in now).

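The sequence Peter proposes in reply to step 1 (add the new disk to a
one-member mirror, wait for the sync, then remove the original) can be
sketched as a toy model. This is only an illustration, not mdadm or md
code: the Mirror class, its methods and the disk names are hypothetical,
and the sketch assumes the copy from the old disk succeeds, which is
exactly what Guy disputes when the old disk has bad blocks.

```python
class Mirror:
    """Toy stand-in for one RAID1 component of the top array (hypothetical)."""

    def __init__(self, member):
        self.members = [member]  # a single active member, i.e. a degraded mirror
        self.synced = {member}   # members currently holding valid data

    def add(self, disk):
        self.members.append(disk)  # joins the mirror but holds no valid data yet

    def sync(self, disk):
        self.synced.add(disk)  # assumption: the copy from the old member succeeds

    def remove(self, disk):
        # Refuse to drop the only member that still holds valid data.
        if not (self.synced - {disk}):
            raise RuntimeError("would remove the last in-sync member")
        self.members.remove(disk)
        self.synced.discard(disk)

    def available(self):
        return bool(self.synced)  # all the top array needs to know


def swap_member(mirror, old, new):
    """Replace `old` with `new`, recording availability after each step."""
    seen = []
    for step in (lambda: mirror.add(new),
                 lambda: mirror.sync(new),
                 lambda: mirror.remove(old)):
        step()
        seen.append(mirror.available())
    return seen


m = Mirror("old_disk")
history = swap_member(m, "old_disk", "new_disk")
print(history, m.members)
```

In this model the mirror stays available after every step, so the array
above it never sees a failed component. Removing the old disk before the
sync has finished is refused, which is where a read error on the old
disk would bite in practice.
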
The problem pointed out is that if the disk you are going to swap out is
faulty, there's no way of copying from it perfectly. The read patch I
posted a few days ago will help, but it won't paper over real sector
errors - it may allow the copy to proceed, however (I'll have to check
what happens during a sync).

So one has to substitute using data from the redundant parts of the
array above (in the array-of-arrays solution). But there's no
communication between the levels at present :(.

Well,

1) if one were to use bitmaps, I would suggest that in the case of an
   array of arrays the bitmap be shared between an array and its
   subarrays - do we really care in which disk a problem is? No - we
   know we just have to try and find some good data and correct a
   problem in that block, and we can go searching for the details if
   and when we need to.

2) I don't see any problem in, even without a bitmap, simply augmenting
   the repair strategy for read errors (which you people don't have
   yet, heh) to include getting the data from the array above if we are
   in a subarray, not just using our own redundancy.

Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
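
Peter's point 2, a read-error repair path that falls back to the array
above when a subarray cannot serve a block itself, might look roughly
like the sketch below. It is a toy model under stated assumptions: the
top array is reduced to a single XOR-parity stripe, and Component,
ReadError and read_with_parent_repair are hypothetical names, not part
of the md driver.

```python
from functools import reduce


class ReadError(Exception):
    """Raised when a component cannot return a block (e.g. a bad sector)."""


class Component:
    """One member of the top array (hypothetical stand-in for a subarray)."""

    def __init__(self, blocks, bad=()):
        self.blocks = list(blocks)
        self.bad = set(bad)  # block numbers that fail on read

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)
        return self.blocks[n]


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def read_with_parent_repair(components, idx, n):
    """Read block n from components[idx]; on failure, rebuild it from the
    other members of the array above and write the result back."""
    try:
        return components[idx].read(n)
    except ReadError:
        others = [c.read(n) for j, c in enumerate(components) if j != idx]
        data = xor_blocks(others)
        components[idx].blocks[n] = data  # correct the bad block in place
        components[idx].bad.discard(n)
        return data


d0, d1 = b"\x01\x02", b"\x10\x20"
parity = xor_blocks([d0, d1])
stripe = [Component([b"\x00\x00"], bad={0}), Component([d1]), Component([parity])]
recovered = read_with_parent_repair(stripe, 0, 0)
print(recovered == d0)
```

The sketch only covers a single failure per stripe; if a second
component also fails on the same block, the ReadError propagates, just
as a doubly degraded stripe cannot be reconstructed.
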