> On Sep 3, 2019, at 12:49 PM, Guilherme G. Piccoli <gpiccoli@xxxxxxxxxxxxx> wrote:
>
> Currently md raid0/linear are not provided with any mechanism to validate
> if an array member got removed or failed. The driver keeps sending BIOs
> regardless of the state of array members, and the kernel shows state
> 'clean' in the 'array_state' sysfs attribute. This leads to the following
> situation: if a raid0/linear array member is removed and the array is
> mounted, a user writing to this array won't realize that errors are
> happening unless they check dmesg or perform one fsync per written file.
> Despite udev signaling that the member device is gone, 'mdadm' cannot
> issue the STOP_ARRAY ioctl successfully, given the array is mounted.
>
> In other words, no -EIO is returned and writes (except direct ones) appear
> normal. This means the user might think the written data is correctly
> stored in the array, but instead garbage was written, given that raid0
> does striping (and so requires all its members to be working in order not
> to corrupt data). For md/linear, writes to the available members will work
> fine, but if the writes go to the missing member(s), the result is file
> corruption, since the portion of the writes aimed at the missing devices
> is never actually written.
>
> This patch changes this behavior: we check if the block device's gendisk
> is UP when submitting the BIO to the array member, and if it isn't, we flag
> the md device as MD_BROKEN and fail subsequent I/Os to that device; a read
> request to the array requiring data from a valid member is still completed.
> While flagging the device as MD_BROKEN, we also show a rate-limited warning
> in the kernel log.
>
> A new array state 'broken' was added too: it mimics the state 'clean' in
> every aspect, being useful only to distinguish if the array has some member
> missing. We rely on the MD_BROKEN flag to put the array in the 'broken'
> state. This state cannot be written to 'array_state': since it only shows
> that one or more members of the array are missing and otherwise acts like
> 'clean', writing it wouldn't make sense.
>
> With this patch, the filesystem reacts much faster to a missing array
> member: after some I/O errors, ext4 for instance aborts the journal and
> prevents corruption. Without this change, we're able to keep writing to
> the disk, and after a machine reboot e2fsck shows some severe fs errors
> that demand fixing. This patch was tested on ext4 and xfs filesystems, and
> requires a 'mdadm' counterpart to handle the 'broken' state.
>
> Cc: Song Liu <songliubraving@xxxxxx>
> Reviewed-by: NeilBrown <neilb@xxxxxxx>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxxxxx>

Applied to md-next. Thanks!
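
For anyone who wants to watch for the new 'broken' state from user space, the following is a minimal Python sketch, not part of the patch. It assumes the usual /sys/block/<md device>/md/array_state location of the attribute. The device name "md0" and the helper names read_array_state() and is_broken() are hypothetical and only used for illustration.

```python
from pathlib import Path


def read_array_state(md_name: str, sysfs_root: str = "/sys/block") -> str:
    """Return the contents of the array_state attribute for an md device.

    md_name is a device name such as "md0" (an assumed example name).
    sysfs_root is a parameter so the function can be pointed at a fake tree.
    """
    path = Path(sysfs_root) / md_name / "md" / "array_state"
    return path.read_text().strip()


def is_broken(state: str) -> bool:
    """True if the state string is 'broken', i.e. a member is missing."""
    return state.strip() == "broken"


if __name__ == "__main__":
    import sys

    # "md0" is only a default for illustration; pass the real name as argv[1].
    name = sys.argv[1] if len(sys.argv) > 1 else "md0"
    try:
        state = read_array_state(name)
    except OSError as exc:
        print(f"cannot read array_state for {name}: {exc}")
    else:
        note = " (one or more members missing)" if is_broken(state) else ""
        print(f"{name}: {state}{note}")
```

Since 'broken' cannot be written to 'array_state', the sketch only reads the attribute. On a kernel without this patch the attribute would keep reporting 'clean' in the same situation.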