On Sun, Dec 4, 2016 at 7:41 PM, NeilBrown <neilb@xxxxxxxx> wrote:
> On Sat, Dec 03 2016, Marc Smith wrote:
>
>> Finally, I got it! Why is it when I want it to break, it doesn't. =)
>
> welcome to my world :-)
>
>
>>
>> I will say, using the modified mdadm that prevents the synthesized
>> CHANGE event, it seems to not induce the problem as regularly.
>>
>> Below are the kernel logs after stopping an array:
>
> Thank you so much for persisting with this.
> The logs you provide make it clear that two separate processes (494 and
> 31178) increment the ->active count by opening the device, but never
> decrement that count by closing the device.
> It seems too unlikely that either process would be holding the
> file descriptor open indefinitely, so something must be going wrong
> either as part of 'open', or as part of 'close'.
>
> Now that I know where to look, the bug is obvious. Why didn't I see
> that before?
>
> The open request is failing, almost certainly because MD_CLOSING is set,
> but the ->active count isn't being decremented on failure.
> This patch should fix it.
>
> Please test and report results.
>
> Thanks,
> NeilBrown
>
> Fixes: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag" v4.9-rc1)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 2089d46b0eb8..a8e07eb2ca5f 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7087,11 +7087,14 @@ static int md_open(struct block_device *bdev, fmode_t mode)
>  	}
>  	BUG_ON(mddev != bdev->bd_disk->private_data);
>
> -	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
> +	if ((err = mutex_lock_interruptible(&mddev->open_mutex))) {
> +		mddev_put(mddev);
>  		goto out;
> +	}
>
>  	if (test_bit(MD_CLOSING, &mddev->flags)) {
>  		mutex_unlock(&mddev->open_mutex);
> +		mddev_put(mddev);
>  		return -ENODEV;
>  	}
>

That did the trick! I ran 'mdadm --stop' eight different times on two
nodes, and every time it was removed completely from /dev and
/sys/block just like expected.
=) Thanks for your effort and time on this. I think I read in the other
thread RE: this patch, it may make it into 4.9?

--Marc
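
For anyone finding this thread later, here is a rough userspace sketch of the
rule the patch restores: once a reference has been taken in the open path,
every early return has to drop it again. This is not the md code. The names
toy_dev, toy_get, toy_put, toy_open and toy_release are made up for
illustration, and the two boolean fields only simulate the "lock was
interrupted" and "MD_CLOSING is set" cases.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device object; not struct mddev. */
struct toy_dev {
	int active;        /* reference count, playing the role of ->active */
	bool closing;      /* simulates the MD_CLOSING flag being set */
	bool interrupted;  /* simulates the interruptible lock failing */
};

/* Take a reference. */
void toy_get(struct toy_dev *d)
{
	d->active++;
}

/* Drop a reference; dropping below zero would be a bug. */
void toy_put(struct toy_dev *d)
{
	assert(d->active > 0);
	d->active--;
}

/*
 * Open path: the reference is taken first, so each failure branch
 * must call toy_put() before returning. Forgetting that is the kind
 * of leak described above. On success the reference is kept until
 * toy_release().
 */
int toy_open(struct toy_dev *d)
{
	toy_get(d);

	if (d->interrupted) {
		toy_put(d);
		return -EINTR;
	}

	if (d->closing) {
		toy_put(d);
		return -ENODEV;
	}

	return 0;
}

/* Close path: drops the reference kept by a successful toy_open(). */
void toy_release(struct toy_dev *d)
{
	toy_put(d);
}
```

With both failure branches calling toy_put(), a failed toy_open() leaves
active where it started, and only a successful open followed by
toy_release() changes and then restores the count.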