On 3/31/21 12:17 AM, Christoph Hellwig wrote:
Due to the flush_workqueue() call in md_alloc, no previous instance of
mddev can still be around at this point.
Signed-off-by: Christoph Hellwig <hch@xxxxxx>
---
drivers/md/md.c | 35 +++++++----------------------------
1 file changed, 7 insertions(+), 28 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 368cad6cd53a6e..cd2d825dd4f881 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7807,43 +7807,22 @@ static int md_open(struct block_device *bdev, fmode_t mode)
 	 * Succeed if we can lock the mddev, which confirms that
 	 * it isn't being stopped right now.
 	 */
-	struct mddev *mddev = mddev_find(bdev->bd_dev);
+	struct mddev *mddev = bdev->bd_disk->private_data;
 	int err;
-	if (!mddev)
-		return -ENODEV;
-
-	if (mddev->gendisk != bdev->bd_disk) {
-		/* we are racing with mddev_put which is discarding this
-		 * bd_disk.
-		 */
-		mddev_put(mddev);
-		/* Wait until bdev->bd_disk is definitely gone */
-		if (work_pending(&mddev->del_work))
-			flush_workqueue(md_misc_wq);
-		/* Then retry the open from the top */
-		return -ERESTARTSYS;
-	}
-	BUG_ON(mddev != bdev->bd_disk->private_data);
-
-	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
-		goto out;
-
+	err = mutex_lock_interruptible(&mddev->open_mutex);
+	if (err)
+		return err;
 	if (test_bit(MD_CLOSING, &mddev->flags)) {
 		mutex_unlock(&mddev->open_mutex);
-		err = -ENODEV;
-		goto out;
+		return -ENODEV;
 	}
-
-	err = 0;
+	mddev_get(mddev);
 	atomic_inc(&mddev->openers);
 	mutex_unlock(&mddev->open_mutex);
 	bdev_check_media_change(bdev);
- out:
-	if (err)
-		mddev_put(mddev);
-	return err;
+	return 0;
 }
 
 static void md_release(struct gendisk *disk, fmode_t mode)
Hello Christoph,
After applying your patch, md_open() becomes:
```
static int md_open(struct block_device *bdev, fmode_t mode)
{
	/* ... */
	struct mddev *mddev = bdev->bd_disk->private_data;
	int err;

	err = mutex_lock_interruptible(&mddev->open_mutex);
	if (err)
		return err;
	if (test_bit(MD_CLOSING, &mddev->flags)) {
		mutex_unlock(&mddev->open_mutex);
		return -ENODEV;
	}
	mddev_get(mddev);
	atomic_inc(&mddev->openers);
	mutex_unlock(&mddev->open_mutex);
	bdev_check_media_change(bdev);
	return 0;
}
```
In the clean path, MD_CLOSING only lives for a very short time before it is cleared in md_clean():
```
ioctl
 + test_and_set_bit(MD_CLOSING, &mddev->flags)
 + do_md_stop          // case STOP_ARRAY
     md_clean
       mddev->flags = 0;
```
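For reference, md_clean() wipes mddev->flags wholesale, which implicitly drops MD_CLOSING along with every other flag bit. An abridged sketch (most field resets elided; based on the mainline md_clean() of that time):
```
/* abridged sketch of md_clean(): most field resets are elided */
static void md_clean(struct mddev *mddev)
{
	/* ... many mddev fields are reset to their defaults here ... */
	mddev->flags = 0;	/* clears every flag bit, including MD_CLOSING */
	mddev->sb_flags = 0;
	/* ... */
}
```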
when userspace "mdadm -Ss" finish (the ioctl STOP_ARRAY returns),
mddev->flags will be zero. and you can see my patch email (date: 2021-3-30).
At this time, userspace will execute "mdadm --monitor" to scan the
closing md device. the md_open will trigger very soon. at this time,
bdev->bd_disk->private_data is only a skeleton, your shouldn't trust & use it.
So mddev with MD_CLOSING protection, the md_open is not safety.
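Roughly, the window I am worried about looks like this (a timeline sketch of the race described above, not an actual trace):
```
CPU A: mdadm -Ss (STOP_ARRAY)             CPU B: mdadm --monitor
ioctl
  set MD_CLOSING
  do_md_stop
    md_clean
      mddev->flags = 0;                   <-- MD_CLOSING is gone
  (teardown of the array continues)
                                          md_open
                                            mddev = bdev->bd_disk->private_data  <-- skeleton
                                            test_bit(MD_CLOSING) == 0
                                            open "succeeds" on a dying mddev
```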
PS:
Neil Brown's legacy commit d3374825ce57ba2214d37502397 also describes this condition.
Thanks,
heming