This is a note to let you know that I've just added the patch titled

    md/raid1,5,10: Disable WRITE SAME until a recovery strategy is in place

to the 3.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     md-raid1-5-10-disable-write-same-until-a-recovery-strategy-is-in-place.patch
and it can be found in the queue-3.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 5026d7a9b2f3eb1f9bda66c18ac6bc3036ec9020 Mon Sep 17 00:00:00 2001
From: "H. Peter Anvin" <hpa@xxxxxxxxx>
Date: Wed, 12 Jun 2013 07:37:43 -0700
Subject: md/raid1,5,10: Disable WRITE SAME until a recovery strategy is in place

From: "H. Peter Anvin" <hpa@xxxxxxxxx>

commit 5026d7a9b2f3eb1f9bda66c18ac6bc3036ec9020 upstream.

There are cases where the kernel will believe that the WRITE SAME
command is supported by a block device which does not, in fact,
support WRITE SAME.  This currently happens for SATA drives behind a
SAS controller, but there are probably a hundred other ways that can
happen, including drive firmware bugs.

After receiving an error for WRITE SAME the block layer will retry the
request as a plain write of zeroes, but mdraid will consider the
failure as fatal and consider the drive failed.  This has the effect
that all the mirrors containing a specific set of data are each
offlined in very rapid succession, resulting in data loss.

However, just bouncing the request back up to the block layer isn't
ideal either, because the whole initial request-retry sequence should
be inside the write bitmap fence, which probably means that md needs
to do its own conversion of WRITE SAME to write zero.

Until the failure scenario has been sorted out, disable WRITE SAME for
raid1, raid5, and raid10.

[neilb: added raid5]

This patch is appropriate for any -stable since 3.7, when write_same
support was added.
Signed-off-by: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx>
Signed-off-by: NeilBrown <neilb@xxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 drivers/md/raid1.c  |    4 ++--
 drivers/md/raid10.c |    3 +--
 drivers/md/raid5.c  |    4 +++-
 3 files changed, 6 insertions(+), 5 deletions(-)

--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2837,8 +2837,8 @@ static int run(struct mddev *mddev)
 		return PTR_ERR(conf);
 
 	if (mddev->queue)
-		blk_queue_max_write_same_sectors(mddev->queue,
-						 mddev->chunk_sectors);
+		blk_queue_max_write_same_sectors(mddev->queue, 0);
+
 	rdev_for_each(rdev, mddev) {
 		if (!mddev->gendisk)
 			continue;
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3635,8 +3635,7 @@ static int run(struct mddev *mddev)
 	if (mddev->queue) {
 		blk_queue_max_discard_sectors(mddev->queue,
 					      mddev->chunk_sectors);
-		blk_queue_max_write_same_sectors(mddev->queue,
-						 mddev->chunk_sectors);
+		blk_queue_max_write_same_sectors(mddev->queue, 0);
 		blk_queue_io_min(mddev->queue, chunk_size);
 		if (conf->geo.raid_disks % conf->geo.near_copies)
 			blk_queue_io_opt(mddev->queue, chunk_size * conf->geo.raid_disks);
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5457,7 +5457,7 @@ static int run(struct mddev *mddev)
 		if (mddev->major_version == 0 &&
 		    mddev->minor_version > 90)
 			rdev->recovery_offset = reshape_offset;
-			
+
 		if (rdev->recovery_offset < reshape_offset) {
 			/* We need to check old and new layout */
 			if (!only_parity(rdev->raid_disk,
@@ -5580,6 +5580,8 @@ static int run(struct mddev *mddev)
 		 */
 		mddev->queue->limits.discard_zeroes_data = 0;
 
+		blk_queue_max_write_same_sectors(mddev->queue, 0);
+
 		rdev_for_each(rdev, mddev) {
 			disk_stack_limits(mddev->gendisk, rdev->bdev,
 					  rdev->data_offset << 9);


Patches currently in stable-queue which might be from hpa@xxxxxxxxx are

queue-3.9/cpu-hotplug-provide-a-generic-helper-to-disable-enable-cpu-hotplug.patch
queue-3.9/reboot-rigrate-shutdown-reboot-to-boot-cpu.patch
queue-3.9/md-raid1-5-10-disable-write-same-until-a-recovery-strategy-is-in-place.patch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
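
For readers skimming the diff: the functional change in every hunk is the
same one-line call, sketched below as a self-contained fragment.  The wrapper
function name is illustrative only and does not appear in the patch; the
actual change simply passes a limit of 0 to blk_queue_max_write_same_sectors()
from each personality's run() routine, so upper layers stop issuing WRITE SAME
to the array and fall back to plain writes of zeroes.

#include <linux/blkdev.h>

/*
 * Illustrative sketch, not part of the patch: advertising a WRITE SAME
 * limit of 0 sectors marks the queue as not supporting the command, so
 * callers such as blkdev_issue_zeroout() fall back to ordinary writes
 * of zero-filled pages instead of sending WRITE SAME bios to md.
 */
static inline void md_disable_write_same(struct request_queue *q)	/* hypothetical helper */
{
	blk_queue_max_write_same_sectors(q, 0);
}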