On Fri, 15 Jan 2010 10:36:39 -0500
Brett Russ <bruss@xxxxxxxxxxx> wrote:

> On 01/14/2010 02:24 PM, Michael Evans wrote:
> > On Thu, Jan 14, 2010 at 7:10 AM, Brett Russ <bruss@xxxxxxxxxxx> wrote:
> >> Slightly related to my last message here Re: non-fresh behavior, we have seen
> >> cases where the following happens:
> >> * healthy 2 disk raid1 (disks A & B) incurs a problem with disk B
> >> * disk B is removed, unit is now degraded
> >> * replacement disk C is added; recovery from A to C begins
> >> * during recovery, disk A incurs a brief lapse in connectivity. At this
> >>   point C is still up yet only has a partial copy of the data.
> >> * a subsequent assemble operation on the raid1 results in disk A being
> >>   kicked out as non-fresh, yet C is allowed in.
> >
> > I believe the desired and logical behavior here is to refuse running
> > an incomplete array unless explicitly forced to do so. Incremental
> > assembly might be what you're seeing.
>
> This brings up a good point. I didn't mention that the assemble in the
> last step above was forced. Thus, the "bug" I'm reporting is that under
> duress, mdadm/md chose to assemble the array with a partially recovered
> (but "newer") member instead of the older member which was the recovery
> *source* for the newer member.
>
> What I think should happen is members that are *destinations* for
> recovery should *never* receive a higher event count, timestamp, or any
> other marking than the recovery sources. By definition they are
> incomplete and can't be trusted, thus they should never trump a complete
> member during assemble. I would assume the code already does this but
> perhaps there is a hole.
>
> One other piece of information that may be relevant--we're using 2
> member RAID1 units with one member marked write-mostly. At this time, I
> don't have the specifics for which member (A or B) was the write-mostly
> member in the example above, but I can find that out.
>
> > I very much recommend running it read-only until you can determine which
> > assembly pattern produces the most viable results.
>
> Good tip. We were able to manually recover the array in the case
> outlined above, now we're looking back to fixing the kernel to prevent
> it happening again.
>

Thanks for the report. It sounds like a real problem.

I'm travelling at the moment so reproducing it would be a challenge.

If you are able to, can you report the output of
  mdadm -E /dev/list-of-devices
at the key points in the process, and also add "-v" to any
  mdadm --assemble
command you use, and report the output?

Thanks,
NeilBrown
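
For readers following the thread, the selection rule Brett proposes can be sketched as a toy model. This is only an illustration of the idea under stated assumptions, not md's or mdadm's actual code: the `Member` type, its fields, and both function names are hypothetical, and the event counts are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Member:
    """Hypothetical, simplified view of one array member (not md's real superblock)."""
    name: str          # illustrative label, e.g. "A" or "C"
    events: int        # event count recorded for the member (made-up values below)
    recovering: bool   # True if the member is a partially rebuilt recovery destination


def naive_select(members: List[Member]) -> List[Member]:
    """Keep only members with the highest event count; the rest are 'non-fresh'."""
    newest = max(m.events for m in members)
    return [m for m in members if m.events == newest]


def proposed_select(members: List[Member]) -> List[Member]:
    """Rank only complete members, so a recovery destination never outranks a source."""
    complete = [m for m in members if not m.recovering]
    if not complete:
        return []  # no complete copy of the data: refuse to assemble
    newest = max(m.events for m in complete)
    sources = [m for m in complete if m.events == newest]
    # Recovery destinations are admitted only alongside a complete source,
    # so that recovery can resume from it.
    return sources + [m for m in members if m.recovering]


disk_a = Member("A", events=100, recovering=False)  # recovery source that briefly dropped out
disk_c = Member("C", events=102, recovering=True)   # partial copy with a higher event count

print([m.name for m in naive_select([disk_a, disk_c])])     # ['C']
print([m.name for m in proposed_select([disk_a, disk_c])])  # ['A', 'C']
```

In this model the naive rule reproduces the reported outcome (A kicked as non-fresh, C allowed in), while the proposed rule keeps A as the trusted member and returns nothing at all when only a partially recovered member is available.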