On Thu, 2005-08-18 at 15:28 +1000, Neil Brown wrote:
> However I think I would like to do it a little bit differently.

Thanks for your reply, interesting ideas!

> If we want to mirror a single drive in a raid5 array, I would really
> like to do that using the raid1 personality.
> e.g.
>    suspend io
>    remove the drive
>    build a raid1 (with no superblock) using the drive.
>    add that back into the array
>    resume io.
>
> Then another drive can be added to the raid1 and synced.
>
> This allows shuffling of drives even when they haven't actually
> failed.

The current hack allows that too, but I agree that using the raid1
personality would be a clean solution. I have some doubts, though.
Mirroring a drive is a very simple task with this master-slave method
(a few lines of code in raid5.c), but if we bring raid1 into the game,
the situation gets more difficult:
- we would have to transfer the badblock cache when building the raid1
- raid1.c would have to be hacked to make requests for data when the
  sync has stopped due to a read error and the parent is a raid5 array
- many steps are needed to make the change, and error handling becomes
  more complex

Anyway, I will try to change my patch to use the raid1 personality this
weekend.

> To handle read failures, I would like the first step to be to re-write
> the failed block.  I believe most (all?) drives will relocate the
> block if a write cannot succeed at the normal location, so this will
> often fix the problem.

I think that is an easy task; the question is how we can check whether
there is any point in doing it. I mean, if we rewrite a bad stripe but
there is no automatic reallocation, or the drive has already used all
of its spare sectors, our write will succeed thanks to the drive's
cache, but every time we reread it we will get a bad sector back, and
rewriting it over and over becomes pointless.
Currently, with my hack, a userspace program can issue a read-write
cycle based on the bad sector list in /proc/mdstat, and we can hope
that solves the problem.
Another solution might be a table of recently rewritten blocks: if a
block appears in it too often, we put it on a 'total failed' list and
never touch it again. But I'm not sure that the latter is better. With
badblock tolerance, the 'timed rewrite from userspace' sounds like a
good solution, IMHO.

> This possibly doesn't handle the possibility of a write failing very
> well, but I'm not sure what your approach does in that case.  Could
> you explain that?

I can't do anything about that either: if a write fails, the drive
will be marked failed immediately.

--
dap
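
For illustration only, here is a rough sketch of what the table of
recently rewritten blocks mentioned above could look like. It is not
taken from the patch: the names (rewrite_table, rewrite_table_note),
the table size and the rewrite limit are all made up for this example,
and a kernel version would need locking and some form of aging on top.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REWRITE_SLOTS 64  /* table size, arbitrary for this sketch */
#define REWRITE_LIMIT 3   /* rewrites allowed per sector, arbitrary */

struct rewrite_entry {
    uint64_t sector;
    unsigned count;       /* 0 means the slot is unused */
};

struct rewrite_table {
    struct rewrite_entry slot[REWRITE_SLOTS];
    size_t next_victim;   /* round-robin cursor used when the table is full */
};

/*
 * Record one rewrite of `sector`.
 * Returns true if the rewrite may go ahead, false if the sector has
 * already been rewritten REWRITE_LIMIT times and should be treated as
 * 'total failed' (never rewritten again).
 */
bool rewrite_table_note(struct rewrite_table *t, uint64_t sector)
{
    struct rewrite_entry *target = NULL;

    for (size_t i = 0; i < REWRITE_SLOTS; i++) {
        struct rewrite_entry *e = &t->slot[i];

        if (e->count > 0 && e->sector == sector) {
            if (e->count >= REWRITE_LIMIT)
                return false;           /* on the 'total failed' list */
            e->count++;
            return true;
        }
        if (e->count == 0 && target == NULL)
            target = e;                 /* remember the first free slot */
    }

    /* Table full: recycle a slot, but never one holding a failed sector. */
    for (size_t n = 0; n < REWRITE_SLOTS && target == NULL; n++) {
        struct rewrite_entry *e = &t->slot[t->next_victim];

        t->next_victim = (t->next_victim + 1) % REWRITE_SLOTS;
        if (e->count < REWRITE_LIMIT)
            target = e;
    }
    if (target == NULL)
        return true;    /* every slot holds a failed sector; left untracked */

    target->sector = sector;
    target->count = 1;
    return true;
}
```

With a zero-initialised table, the first three calls for the same
sector return true and the fourth returns false, at which point the
caller would stop rewriting that sector.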