+ md-change-lifetime-rules-for-md-devices.patch added to -mm tree

The patch titled
     md: change lifetime rules for 'md' devices.
has been added to the -mm tree.  Its filename is
     md-change-lifetime-rules-for-md-devices.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

------------------------------------------------------
Subject: md: change lifetime rules for 'md' devices.
From: Neil Brown <neilb@xxxxxxx>

Currently md devices are created when first opened and remain in existence
until the module is unloaded.  This isn't a major problem, but it is
somewhat ugly.

This patch changes the lifetime rules so that an md device will disappear
on the last close if it has no state.

Locking rules depend on bd_mutex being held in do_open and __blkdev_put,
and on setting bd_disk->private_data to 'mddev'.

There is room for a race because md_probe is called early in do_open
(get_gendisk) to create the mddev.  As this isn't protected by bd_mutex, a
concurrent call to md_close can destroy that mddev before do_open calls
md_open to get a reference on it.

md_open and md_close are serialised by md_mutex, so the worst that can
happen is that md_open finds that the mddev structure doesn't exist after
all.  In that case bd_disk->private_data will be NULL, and md_open chooses
to exit with -EBUSY, which is arguably an appropriate result.
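
To make the ordering above concrete, here is a minimal userspace sketch of
the three steps (probe, concurrent last close, open).  It is not kernel
code: sketch_disk, sketch_probe, sketch_last_close and sketch_open are
hypothetical names, the interleaving is simulated sequentially rather than
with real threads, and no locking is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the gendisk; only private_data is modelled. */
struct sketch_disk {
	void *private_data;
};

/* Step 1: the probe creates the device and publishes it in private_data. */
static void sketch_probe(struct sketch_disk *d, void *mddev)
{
	assert(mddev != NULL);
	d->private_data = mddev;
}

/* Step 2: a concurrent last close discards the stateless device. */
static void sketch_last_close(struct sketch_disk *d)
{
	d->private_data = NULL;
}

/* Step 3: open finds nothing to take a reference on and gives up. */
static int sketch_open(struct sketch_disk *d)
{
	int err = -EBUSY;

	if (!d->private_data)
		goto out;
	err = 0;	/* a real open would take a reference here */
out:
	return err;
}
```

Calling sketch_probe, then sketch_last_close, then sketch_open on the same
sketch_disk walks through the losing side of the race: the open fails with
-EBUSY instead of dereferencing a destroyed structure.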

The new 'dead' field in mddev is used to track whether it is time to
destroy the mddev (if a last-close happens).  It is cleared when any state
is created (set_array_info, bind_rdev_to_array) and set when the array is
stopped (do_md_stop).

mddev_put becomes simpler.  It just destroys the mddev when the refcount
hits zero.  This will normally be the reference held in
bd_disk->private_data.
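
The interaction between the refcount and the 'dead' flag can be modelled in
userspace roughly as follows.  This is only a sketch under assumptions: the
model_* names are hypothetical, locking is omitted, and the model decrements
its own 'openers' counter where the patch instead relies on bd_openers
having been updated before md_release runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for mddev_t; not the kernel structure. */
struct model_mddev {
	int active;	/* reference count, like mddev->active */
	bool dead;	/* true while the array holds no state */
};

/* Stand-in for the gendisk, which holds one reference in private_data. */
struct model_disk {
	struct model_mddev *private_data;
	int openers;	/* plays the role of bd_openers */
};

static int model_destroyed;	/* lets a caller observe destruction */

/* Like creating a new mddev: one reference, no state yet. */
static struct model_mddev *model_create(struct model_disk *disk)
{
	struct model_mddev *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->active = 1;		/* the reference kept in private_data */
	m->dead = true;
	disk->private_data = m;
	return m;
}

/* Simplified put: destroy when the count hits zero, no other checks. */
static void model_put(struct model_mddev *m)
{
	if (--m->active > 0)
		return;
	model_destroyed++;
	free(m);
}

/* Open takes an extra reference for the opener. */
static int model_open(struct model_disk *disk)
{
	struct model_mddev *m = disk->private_data;

	if (!m)
		return -1;
	m->active++;
	disk->openers++;
	return 0;
}

/* Last close of a dead device also drops the private_data reference. */
static void model_release(struct model_disk *disk)
{
	struct model_mddev *m = disk->private_data;

	assert(m != NULL);	/* stands in for BUG_ON(!mddev) */
	disk->openers--;
	if (disk->openers == 0 && m->dead) {
		disk->private_data = NULL;
		model_put(m);	/* reference held by private_data */
	}
	model_put(m);		/* reference held by this opener */
}
```

With 'dead' cleared, an open followed by a release leaves the device in
place; once 'dead' is set again, the next last close drops both references
and the device is destroyed.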

Signed-off-by: Neil Brown <neilb@xxxxxxx>
Cc: "Rafael J. Wysocki" <rjw@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 drivers/md/md.c           |   32 +++++++++++++++++++++++---------
 include/linux/raid/md_k.h |    3 +++
 2 files changed, 26 insertions(+), 9 deletions(-)

diff -puN drivers/md/md.c~md-change-lifetime-rules-for-md-devices drivers/md/md.c
--- a/drivers/md/md.c~md-change-lifetime-rules-for-md-devices
+++ a/drivers/md/md.c
@@ -226,13 +226,14 @@ static void mddev_put(mddev_t *mddev)
 {
 	if (!atomic_dec_and_lock(&mddev->active, &all_mddevs_lock))
 		return;
-	if (!mddev->raid_disks && list_empty(&mddev->disks)) {
-		list_del(&mddev->all_mddevs);
-		spin_unlock(&all_mddevs_lock);
-		blk_cleanup_queue(mddev->queue);
-		kobject_unregister(&mddev->kobj);
-	} else
-		spin_unlock(&all_mddevs_lock);
+	list_del(&mddev->all_mddevs);
+	spin_unlock(&all_mddevs_lock);
+
+	del_gendisk(mddev->gendisk);
+	mddev->gendisk = NULL;
+	blk_cleanup_queue(mddev->queue);
+	mddev->queue = NULL;
+	kobject_unregister(&mddev->kobj);
 }
 
 static mddev_t * mddev_find(dev_t unit)
@@ -273,6 +274,7 @@ static mddev_t * mddev_find(dev_t unit)
 	atomic_set(&new->active, 1);
 	spin_lock_init(&new->write_lock);
 	init_waitqueue_head(&new->sb_wait);
+	new->dead = 1;
 
 	new->queue = blk_alloc_queue(GFP_KERNEL);
 	if (!new->queue) {
@@ -1384,6 +1386,7 @@ static int bind_rdev_to_array(mdk_rdev_t
 		ko = &rdev->bdev->bd_disk->kobj;
 	sysfs_create_link(&rdev->kobj, ko, "block");
 	bd_claim_by_disk(rdev->bdev, rdev, mddev->gendisk);
+	mddev->dead = 0;
 	return 0;
 }
 
@@ -3360,6 +3363,8 @@ static int do_md_stop(mddev_t * mddev, i
 		mddev->array_size = 0;
 		mddev->size = 0;
 		mddev->raid_disks = 0;
+		mddev->dead = 1;
+
 		mddev->recovery_cp = 0;
 
 	} else if (mddev->pers)
@@ -4022,6 +4027,7 @@ static int set_array_info(mddev_t * mdde
 	mddev->new_layout = mddev->layout;
 	mddev->delta_disks = 0;
 
+	mddev->dead = 0;
 	return 0;
 }
 
@@ -4422,8 +4428,12 @@ static int md_open(struct inode *inode, 
 	 * Succeed if we can lock the mddev, which confirms that
 	 * it isn't being stopped right now.
 	 */
-	mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
-	int err;
+	mddev_t *mddev;
+	int err = -EBUSY;
+
+	mddev = inode->i_bdev->bd_disk->private_data;
+	if (!mddev)
+		goto out;
 
 	if ((err = mutex_lock_interruptible_nested(&mddev->reconfig_mutex, 1)))
 		goto out;
@@ -4442,6 +4452,10 @@ static int md_release(struct inode *inod
  	mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
 
 	BUG_ON(!mddev);
+	if (inode->i_bdev->bd_openers == 0 && mddev->dead) {
+		inode->i_bdev->bd_disk->private_data = NULL;
+		mddev_put(mddev);
+	}
 	mddev_put(mddev);
 
 	return 0;
diff -puN include/linux/raid/md_k.h~md-change-lifetime-rules-for-md-devices include/linux/raid/md_k.h
--- a/include/linux/raid/md_k.h~md-change-lifetime-rules-for-md-devices
+++ a/include/linux/raid/md_k.h
@@ -119,6 +119,9 @@ struct mddev_s
 #define MD_CHANGE_PENDING 2	/* superblock update in progress */
 
 	int				ro;
+	int				dead; /* array should be discarded on
+					       * last close
+					       */
 
 	struct gendisk			*gendisk;
 
_

Patches currently in -mm which might be from neilb@xxxxxxx are

origin.patch
auth_gss-unregister-gss_domain-when-unloading-module-fix.patch
fix-sunrpc-wakeup-execute-race-condition.patch
lockdep-annotate-nfs-nfsd-in-kernel-sockets.patch
lockdep-annotate-nfs-nfsd-in-kernel-sockets-tidy.patch
remove-lock_key-approach-to-managing-nested-bd_mutex-locks.patch
simplify-some-aspects-of-bd_mutex-nesting.patch
use-mutex_lock_nested-for-bd_mutex-to-avoid-lockdep-warning.patch
avoid-lockdep-warning-in-md.patch
bdev-fix-bd_part_count-leak.patch
lockdep-annotate-nfsd4-recover-code.patch
md-tidy-up-device-change-notification-when-an-md-array-is-stopped.patch
md-define-raid5_mergeable_bvec.patch
md-handle-bypassing-the-read-cache-assuming-nothing-fails.patch
md-allow-reads-that-have-bypassed-the-cache-to-be-retried-on-failure.patch
md-allow-reads-that-have-bypassed-the-cache-to-be-retried-on-failure-fix.patch
md-enable-bypassing-cache-for-reads.patch
md-conditionalize-some-code.patch
md-change-lifetime-rules-for-md-devices.patch
md-dm-reduce-stack-usage-with-stacked-block-devices.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
