On Tue, 12 Feb 2013 16:05:18 -0500 (EST) Joe Lawrence <Joe.Lawrence@xxxxxxxxxxx> wrote: > Hi Neil, > > I believe I found a bug when failing and removing component devices from > read-only and auto-read-only MD RAID devices. Thanks for reporting them! > > Prior to commit 1ca69c4b "md: avoid taking the mutex on some ioctls", when > failing a read-only MD device component via mdadm --fail, EROFS was > returned. Failing (and removing) auto-read-only components was permitted. I wonder what we really want here.... Given that a read error on a read-only array will mark the device as faulty, it seems fair to allow "--fail" to also mark a device as faulty on a read-only array. Do you have a particular preference for having "mdadm --fail" fail with EROFS on a readonly array? > > After the change, MD allows the failure of component devices for both > read-only and auto-read-only RAID devices. Running mdadm --remove will > return EROFS when attempted on a read-only device. I suspect this is preferred behaviour. > > It gets a little interesting when trying to remove a component from an > auto-read-only MD. The *first* attempt will fail with EBUSY as the rdev > was left in the Blocked state by the SET_DISK_FAULTY ioctl. However, > because the HOT_REMOVE_DISK ioctl made it through to the "switch to rw mode > if started auto-readonly" code, it has been transitioned to read-write. > Subsequent trips through the device's mddev->thread may clear the Blocked > flag via md_update_sb (should the reconfig_mutex be available). A second or > third attempt at removing the failed component from the auto-read-only MD > would then succeed. This definitely a bit odd. In general "--remove" can fail with EBUSY for a short while after the failure, but this happens after an arbitrary delay. I think we just want to allow the "set to read-write" process to complete cleanly before trying the actual removal. Patch below seems to work for me. Can you confirm please? Thanks, NeilBrown diff --git a/drivers/md/md.c b/drivers/md/md.c index 8b557d2..292cc2f 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6529,7 +6529,17 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode, mddev->ro = 0; sysfs_notify_dirent_safe(mddev->sysfs_state); set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); - md_wakeup_thread(mddev->thread); + /* mddev_unlock will wake thread */ + /* If a device failed while we were read-only, we + * need to make sure the metadata is updated now. + */ + if (test_bit(MD_CHANGE_DEVS, &mddev->flags)) { + mddev_unlock(mddev); + wait_event(mddev->sb_wait, + !test_bit(MD_CHANGE_DEVS, &mddev->flags) && + !test_bit(MD_CHANGE_PENDING, &mddev->flags)); + mddev_lock(mddev); + } } else { err = -EROFS; goto abort_unlock;
Attachment:
signature.asc
Description: PGP signature