[PATCH] DM RAID: Fix inability to handle failed devices during start-up

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



DM RAID: Fix inability to handle failed devices during start-up

The code fails to create a RAID array if any of the superblocks cannot be read.
However, if the userspace program passing in the device-mapper table spots the
failure and passes in the sentinel values ('- -') for the failed array position,
then everything is handled properly.  This functional gap should not exist.

This patch makes it so that if a superblock cannot be read, the array position's
fields are initialized as though '- -' were passed in via the CTR table.  That
is, the device is failed and the position should not be used; but if there is
sufficient redundancy, the array should still be activated.

Signed-off-by: Jonathan Brassow <jbrassow@xxxxxxxxxx>

Index: linux-upstream/drivers/md/dm-raid.c
===================================================================
--- linux-upstream.orig/drivers/md/dm-raid.c
+++ linux-upstream/drivers/md/dm-raid.c
@@ -603,7 +603,9 @@ static int read_disk_sb(struct md_rdev *
 		return 0;
 
 	if (!sync_page_io(rdev, 0, size, rdev->sb_page, READ, 1)) {
-		DMERR("Failed to read device superblock");
+		DMERR("Failed to read superblock of device at position %d",
+		      rdev->raid_disk);
+		set_bit(Faulty, &rdev->flags);
 		return -EINVAL;
 	}
 
@@ -854,9 +856,25 @@ static int super_validate(struct mddev *
 static int analyse_superblocks(struct dm_target *ti, struct raid_set *rs)
 {
 	int ret;
+	unsigned redundancy = 0;
+	struct raid_dev *dev;
 	struct md_rdev *rdev, *freshest, *tmp;
 	struct mddev *mddev = &rs->md;
 
+	switch (rs->raid_type->level) {
+	case 1:
+		redundancy = rs->md.raid_disks - 1;
+		break;
+	case 4:
+	case 5:
+	case 6:
+		redundancy = rs->raid_type->parity_devs;
+		break;
+	default:
+		ti->error = "Unknown RAID type";
+		return -EINVAL;
+	}
+
 	freshest = NULL;
 	rdev_for_each(rdev, tmp, mddev) {
 		if (!rdev->meta_bdev)
@@ -871,6 +889,34 @@ static int analyse_superblocks(struct dm
 		case 0:
 			break;
 		default:
+			dev = container_of(rdev, struct raid_dev, rdev);
+			if (redundancy--) {
+				if (dev->meta_dev)
+					dm_put_device(ti, dev->meta_dev);
+				dev->meta_dev = NULL;
+				rdev->meta_bdev = NULL;
+
+				if (rdev->sb_page)
+					put_page(rdev->sb_page);
+				rdev->sb_page = NULL;
+
+				rdev->sb_loaded = 0;
+
+				/*
+				 * We might be able to salvage the data device
+				 * even though the meta device has failed.  For
+				 * now, we behave as though '- -' had been
+				 * passed in for this device via the CTR table.
+				 */
+				if (dev->data_dev)
+					dm_put_device(ti, dev->data_dev);
+				dev->data_dev = NULL;
+				rdev->bdev = NULL;
+
+				list_del(&rdev->same_set);
+
+				continue;
+			}
 			ti->error = "Failed to load superblock";
 			return ret;
 		}


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel


[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux