> > > when the rebuild of the secondary completes. Commonly this would be
> > > ideal, but if the secondary experienced any write errors (that were
> > > recorded in the bad block log) then it would be best to leave both in
> > > place until the sysadmin resolves the situation. So in the first
> > > implementation this failing should not be automatic.
> >
> > Maybe putting the primary as "spare", i.e. not failed nor
> > working, unless the "migration" was not successful. In that
> > case the secondary device should be failed.
>
> Maybe ... but what if both primary and secondary have bad blocks on them?
> What do I do then?

IMHO this means the migration was not successful, so you return
to the original state, with the primary disk up and running.
That assumes you notice that the secondary has bad blocks;
otherwise I do not think there are any other options.

> > My use case here is disk "rotation" :-). That is, for example, a
> > RAID-5/6 with n disks + 1 spare. Each X months/weeks/days/hours
> > one disk is pulled out of the array and the spare one takes over.
> > The pulled out disk will be the new spare (and powered down, possibly).
> > The idea here is to have n disks which will have, after some time,
> > different (increasing) power on hours, so to minimize the possibility
> > of multiple failures.
>
> Interesting idea. This could be managed with some user-space tool that
> initiates the 'hot-replace' and 'fail' from time to time and keeps track of
> ages.

Exactly, my idea was to have a daemon which, from time to time,
maybe reading the power-on hours from the SMART information,
removes the oldest disk and replaces it with the youngest.
There could be other policies, of course.

> > > Better reporting of inconsistencies.
> > > ------------------------------------
> > >
> > > When a 'check' finds a data inconsistency it would be useful if it
> > > was reported. That would allow a sysadmin to try to understand the
> > > cause and possibly fix it.
> >
> > Could you, please, consider to add, for RAID-6, the
> > capability to report also which device, potentially,
> > has the problem? Thanks!
>
> I would rather leave that to user-space. If I report where the problem is, a
> tool could directly read all the blocks in that stripe and perform any fancy
> calculations you like. I may even write that tool (but no promises).

I guess you already have the tool, don't you remember? :-)

bye,

--
piergiorgio
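
For reference, the "fancy calculation" for RAID-6 mentioned above can be
sketched roughly as follows. This is only an illustration, not the actual
tool: it assumes the usual Linux RAID-6 arithmetic (GF(2^8) with polynomial
0x11d and generator 2), that at most one device in the stripe holds wrong
data, and that the blocks have already been read from the devices. The
function names (syndromes, locate_bad_device) are made up for this example.

```python
# Build GF(2^8) log/antilog tables for polynomial 0x11d, generator 2
# (assumed here to match the Linux RAID-6 convention).
GF_EXP = [0] * 255
GF_LOG = [0] * 256
_x = 1
for _i in range(255):
    GF_EXP[_i] = _x
    GF_LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11d


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[(GF_LOG[a] + GF_LOG[b]) % 255]


def syndromes(data):
    """Compute the P and Q blocks for a list of equal-length data blocks."""
    n = len(data[0])
    p = bytearray(n)
    q = bytearray(n)
    for idx, block in enumerate(data):
        coeff = GF_EXP[idx % 255]          # g^idx for data disk idx
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def locate_bad_device(data, p_stored, q_stored):
    """Guess which single device is inconsistent in one stripe.

    Returns None if the stripe is consistent, "P" or "Q" for a parity
    device, a data-disk index, or "unknown" if the bytes disagree or
    point outside the array (more than one device is probably wrong).
    """
    p_calc, q_calc = syndromes(data)
    suspects = set()
    for pc, qc, ps, qs in zip(p_calc, q_calc, p_stored, q_stored):
        dp = pc ^ ps                       # error seen through P
        dq = qc ^ qs                       # error seen through Q
        if dp == 0 and dq == 0:
            continue                       # this byte is consistent
        if dp == 0:
            suspects.add("Q")              # only Q disagrees
        elif dq == 0:
            suspects.add("P")              # only P disagrees
        else:
            # For a bad data disk z: dq = g^z * dp, so z = log(dq) - log(dp).
            z = (GF_LOG[dq] - GF_LOG[dp]) % 255
            suspects.add(z if z < len(data) else "unknown")
    if not suspects:
        return None
    if len(suspects) == 1:
        return suspects.pop()
    return "unknown"
```

A user-space tool would call locate_bad_device() once per stripe reported
by 'check'. The answer is only a hint: if two or more devices are wrong,
the calculation can still point at a single (innocent) device.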