On Monday March 5, eyal@xxxxxxxxxxxxxx wrote:
> Neil Brown wrote:
> > On Sunday March 4, pernegger@xxxxxxxxx wrote:
> >>I have a mismatch_cnt of 384 on a 2-way mirror.
> [trim]
> >>3) Is the "repair" sync action safe to use on the above kernel? Any
> >>other methods / additional steps for fixing this?
> >
> > "repair" is safe, though it may not be effective.
> > "repair" for raid1 did not work until Jan 26th this year.
> > Before then it was identical in effect to 'check'.
>
> How is "repair" safe but not effective? When it finds a mismatch, how does
> it know which part is correct and which should be fixed (which copy of
> raid1, or which block in raid5)?

It is not 'effective' in that before 26 Jan 2007 it did not actually
copy the chosen data onto the other drives, i.e. a 'repair' had the
same effect as a 'check', which is 'safe'.

>
> When a disk fails we know what to rewrite, but when we discover a mismatch
> we do not have this knowledge. It may corrupt the good copy of a raid1.

If a block differs between the different drives in a raid1, then no
copy is 'good'. It is possible that one copy is the one you think you
want, but you probably wouldn't know by looking at it.

The worst situation is to have inconsistent data. If you read and get
one value, then later read and get another value, that is really bad.

For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy
and writing it over all other copies.
For raid5 we assume the data is correct and update the parity.

You might be able to imagine a failure scenario where this produces
the 'wrong' result, but I'm confident that in the majority of cases it
is as good as any other option.

If we had something like ZFS, which tracks checksums for all blocks,
and could somehow get that information usefully into the md level,
then maybe we could do something better.

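To make the policy above concrete, here is a toy model in Python. It
is only a sketch and not the md driver's code: the function names
(repair_raid1, repair_raid5, xor_blocks) are made up for illustration,
blocks are plain byte strings, "arbitrarily" is taken to mean "the
first copy", and raid5 parity is modelled as a plain XOR across the
data blocks of one stripe.

```python
from functools import reduce


def repair_raid1(copies):
    """Toy raid1 repair: pick one copy (here simply the first, as an
    assumption) and write it over every other copy.

    Returns (repaired_copies, mismatch_found)."""
    chosen = copies[0]
    mismatch = any(c != chosen for c in copies[1:])
    return [chosen] * len(copies), mismatch


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_raid5(data_blocks, parity):
    """Toy raid5 repair: trust the data blocks and recompute the parity.

    Returns (new_parity, mismatch_found)."""
    expected = xor_blocks(data_blocks)
    return expected, expected != parity


if __name__ == "__main__":
    # Two mirror halves that disagree in one byte.
    print(repair_raid1([b"AAAA", b"AABA"]))
    # A stripe whose stored parity does not match its data.
    print(repair_raid5([b"\x01\x02", b"\x04\x08"], b"\x00\x00"))
```

Neither function can tell which side was "right"; each only makes the
array consistent again, so that repeated reads return the same value.
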
I suspect that it would be very rare for raid5 to detect a mismatch
during a 'check', and raid1 would only see them when a write was
aborted, such as swap can do, and filesystems might do occasionally
(e.g. truncate a file that was recently written to).

NeilBrown