On Wed, Oct 18 2017, NeilBrown wrote: > On Wed, Oct 18 2017, Coly Li wrote: > >> We have a bug report about NULL dereference in md raid10 code. The >> operations are, >> - Create a raid10 device >> - add a spare disk >> - disconnect one of the online disks from the raid10 device >> - wait to the recovery happens on the spare disk >> - remove the spare disk which is recovering >> And sometimes a kernel oops of NULL dereference in md raid10 module can >> be observed, and crash tool reports the following information: >>> (gdb) list *(raid10d+0xad6) >>> 0x5de6 is in raid10d (../drivers/md/raid10.c:2300). >>> 2295 * the latter is free to free wbio2. >>> 2296 */ >>> 2297 if (wbio2 && !wbio2->bi_end_io) >>> 2298 wbio2 = NULL; >>> 2299 if (wbio->bi_end_io) { >>> 2300 atomic_inc(&conf->mirrors[d].rdev->nr_pending); >>> 2301 md_sync_acct(conf->mirrors[d].rdev->bdev, bio_sectors(wbio)); >>> 2302 generic_make_request(wbio); >>> 2303 } >>> 2304 if (wbio2) { >> At line 2300, conf->mirrors[d].rdev is NULL, this causes a NULL pointer >> dereference to conf->mirrors[d].rdev->nr_pending. After reading previous >> raid10 patches, I find conf->mirrors[d].rdev can be NULL at any point >> unless, >> - a counted reference is held >> - ->reconfig_mutex is held, or >> - rcu_read_lock() is held > > There is one other condition: MD_RECOVERY_RUNNING is set (without > MD_RECOVERY_DONE). > mirrors[d].rdev is only set to NULL by ->hot_remove_disk() which is only > called from remove_and_add_spares(), and that is never called while > resync/recovery/reshape is happening. > So there is no need for RCU protection here. > > Only ... remove_and_add_spares() *can* sometimes be called during > resync - > Commit: 8430e7e0af9a ("md: disconnect device from personality before trying to remove it.") > added a called to remove_and_add_spares() when "remove" is written to > the device/state attribute. That was wrong. > This: > > } else if (cmd_match(buf, "remove")) { > - if (rdev->mddev->pers) { > + if (rdev->mddev->pers && > + !test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) { > clear_bit(Blocked, &rdev->flags); > remove_and_add_spares(rdev->mddev, rdev); > > should fix it. Actually, that doesn't even compile :-( I think this is a better fix. Thanks, NeilBrown From: NeilBrown <neilb@xxxxxxxx> Date: Thu, 19 Oct 2017 09:58:19 +1100 Subject: [PATCH] md: only allow remove_and_add_spares() when no sync_thread running. The locking protocols in md assume that a device will never be removed from an array during resync/recovery/reshape. When that isn't happening, rcu or reconfig_mutex is needed to protect an rdev pointer while taking a refcount. When it is happening, that protection isn't needed. Unfortuantely there is a case were remove_and_add_spares() is called when recovery might be happening: when "remove" is written to the device/state sysfs attribute. This call is an optimization and not essential so it doesn't matter if it fails. So change remove_and_add_spares() to abort early if resync/recovery/reshape is happening. As this can result in a NULL dereference, the fix is suitable for -stable. Cc: Tomasz Majchrzak <tomasz.majchrzak@xxxxxxxxx> Fixes: 8430e7e0af9a ("md: disconnect device from personality before trying to remove it.") Cc: stable@xxxxxxxxxxxxxx (v4.8+) Signed-off-by: NeilBrown <neilb@xxxxxxxx> --- drivers/md/md.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/md/md.c b/drivers/md/md.c index b3192943de7d..01eac0dbca98 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -8563,6 +8563,10 @@ static int remove_and_add_spares(struct mddev *mddev, int removed = 0; bool remove_some = false; + if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) + /* Mustn't remove devices when resync thread is running */ + return 0; + rdev_for_each(rdev, mddev) { if ((this == NULL || rdev == this) && rdev->raid_disk >= 0 && -- 2.14.0.rc0.dirty
Attachment:
signature.asc
Description: PGP signature