On Wed, 2012-11-21 at 18:10 +0100, Sebastian Riemer wrote:
> On 21.11.2012 18:03, Ross Boylan wrote:
> > On Wed, 2012-11-21 at 17:53 +0100, Sebastian Riemer wrote:
> >> On 21.11.2012 17:17, Ross Boylan wrote:
> >>> After I failed and removed a partition, mdadm --examine seems to show
> >>> that partition is fine.
> >>>
> >>> Perhaps related to this, I failed a partition and when I rebooted it
> >>> came up as the sole member of its RAID array.
> >>>
> >>> Is this behavior expected? Is there a way to make the failures more
> >>> convincing?
> >> Yes, it is expected behavior. Without "mdadm --fail" you can't remove a
> >> device from the array. If you stop the array with the failed device,
> >> then the state is stored in the superblock.
> > I'm confused. I did run mdadm --fail. Are you saying that, in addition
> > to doing that, I also need to manipulate sysfs as you describe below?
> > Or were you assuming I didn't mdadm --fail?
> You only need to set the value in the "errors" sysfs file additionally
> to ensure that this device isn't used for assembly anymore.
>
> The kernel reports in "dmesg" then:
> md: kicking non-fresh sdb1 from array!

OK. So if I understand correctly, mdadm --fail has no effect that
persists past a reboot, and it doesn't write anything to disk that would
prevent the failed RAID component from being used.(*) But if I write to
sysfs, the failure will persist across reboots.

This behavior is quite surprising to me. Is there some reason for this
design?

Ross

(*) Also, the different update (or last-use) times either aren't
recorded or don't affect the RAID assembly decision. For example, in my
case md1 included sda3 and sdc3. I failed sdc3, so only sda3 had the
most current data. But when the system rebooted, md1 was assembled from
sdc3 only.
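
For concreteness, here is a rough sketch of the sequence being
discussed, using my device names (md1, sdc3). The sysfs path is my
guess at the per-device "errors" attribute Sebastian refers to, so
treat it as an assumption rather than a verified recipe:

    # Mark the member faulty (this is the step I actually ran).
    mdadm /dev/md1 --fail /dev/sdc3

    # Sebastian's suggestion: also set the per-device error count via
    # sysfs so the device isn't used at the next assembly. The exact
    # path below is my assumption about where that attribute lives.
    echo 1 > /sys/block/md1/md/dev-sdc3/errors

    # Then remove the (now failed) member from the array.
    mdadm /dev/md1 --remove /dev/sdc3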