Re: non-fresh data unavailable bug

On 01/14/2010 02:24 PM, Michael Evans wrote:
> On Thu, Jan 14, 2010 at 7:10 AM, Brett Russ <bruss@xxxxxxxxxxx> wrote:
>> Slightly related to my last message here re: non-fresh behavior, we have
>> seen cases where the following happens:
>> * a healthy 2-disk RAID1 (disks A & B) incurs a problem with disk B
>> * disk B is removed; the unit is now degraded
>> * replacement disk C is added; recovery from A to C begins
>> * during recovery, disk A incurs a brief lapse in connectivity. At this
>>   point C is still up yet has only a partial copy of the data.
>> * a subsequent assemble operation on the raid1 results in disk A being
>>   kicked out as non-fresh, yet C is allowed in.
>
> I believe the desired and logical behavior here is to refuse to run
> an incomplete array unless explicitly forced to do so. Incremental
> assembly might be what you're seeing.

This brings up a good point. I didn't mention that the assemble in the last step above was forced. Thus, the "bug" I'm reporting is that, under duress, mdadm/md chose to assemble the array with the partially recovered (but "newer") member instead of the older member that was the recovery *source* for it.
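For anyone trying to diagnose a case like this, the counter driving the assemble decision is visible in each member's superblock via `mdadm --examine`. A small sketch that compares the `Events` lines from two members' examine output (the output snippets and counter values here are made up for illustration; on a real system they would come from `mdadm --examine /dev/sdX`):

```python
import re

# Hypothetical `mdadm --examine` output fragments for the two members.
EXAMINE_A = """\
          Magic : a92b4efc
        Version : 1.2
         Events : 100
"""
EXAMINE_C = """\
          Magic : a92b4efc
        Version : 1.2
         Events : 112
"""

def event_count(examine_output):
    """Pull the Events counter out of mdadm --examine style output."""
    m = re.search(r"^\s*Events\s*:\s*(\d+)\s*$", examine_output, re.M)
    return int(m.group(1))

a, c = event_count(EXAMINE_A), event_count(EXAMINE_C)
# The partially recovered destination (C) carries the higher count,
# so a forced assemble prefers it over the complete source (A).
preferred = "C" if c > a else "A"
print(preferred)
```

Comparing the counters before forcing an assemble at least tells you which member md will treat as freshest.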

What I think should happen is that members which are *destinations* for recovery should *never* receive a higher event count, timestamp, or any other marking than their recovery sources. By definition they are incomplete and can't be trusted, so they should never trump a complete member during assemble. I would assume the code already does this, but perhaps there is a hole.
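The invariant being proposed fits in a few lines. A toy model of it (this is not the actual md superblock code; the field names and the clamping helper are invented for illustration): clamp a recovery destination's event count to its source's, and break event-count ties in favor of complete members.

```python
from dataclasses import dataclass

@dataclass
class Member:
    name: str
    events: int          # superblock event counter
    in_recovery: bool    # True while this disk is a recovery *destination*

def update_events(member, new_events, source):
    """Record a new event count, but never let a recovery destination
    advance past its source -- the rule proposed above."""
    if member.in_recovery:
        new_events = min(new_events, source.events)
    member.events = new_events

def pick_for_assemble(members):
    """Prefer the highest event count; with the clamp in place a
    partial copy can at best tie its source, and the tie-break then
    favors the complete member."""
    return max(members, key=lambda m: (m.events, not m.in_recovery))

a = Member("A", events=100, in_recovery=False)
c = Member("C", events=100, in_recovery=True)
update_events(c, 112, source=a)   # clamped back to 100
best = pick_for_assemble([a, c])
print(best.name)
```

With the clamp, the complete source A wins the assemble even under --force; without it, C's inflated counter would win, which is exactly the failure described above.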

One other piece of information that may be relevant: we're using 2-member RAID1 units with one member marked write-mostly. At this time I don't have the specifics on which member (A or B) was the write-mostly one in the example above, but I can find that out.

> I very much recommend running it read-only until you can determine which
> assembly pattern produces the most viable results.

Good tip. We were able to manually recover the array in the case outlined above; now we're looking at fixing the kernel to prevent it from happening again.

Thanks,
Brett

