Hi,

it was brought to my attention that there are claims of data corruption
caused by VMware's SCSI implementation.

After investigating, the problem seems to be in the way the completion
handler for WRITE_SAME handles the EOPNOTSUPP error, causing every
WRITE_SAME request but the first on the LVM device to be silently
ignored - the command is never issued, but success is returned to
higher layers.

The problem affects all disks without WRITE_SAME support - and I guess
VMware's SCSI emulation is one of the few that do not support this
command ATM.

Please apply the patch below.

Thanks,
Petr Vandrovec

From: Petr Vandrovec <petr@xxxxxxxxxx>
Subject: [PATCH] Do not silently discard WRITE_SAME requests

When a device does not support WRITE_SAME, after the first failure the
block layer starts throwing away WRITE_SAME requests without warning
anybody, leading to data corruption.

Let's do something about it - do not use the EOPNOTSUPP error, as
apparently that error code is special (use EREMOTEIO, AKA target
failure, like when a request hits hardware), and propagate the
inability to do WRITE_SAME to the top of the stack, so we do not try
to issue WRITE_SAME again and again.

It also reverts 4089b71cc820a426d601283c92fcd4ffeb5139c2, as there is
nothing wrong with VMware's WRITE_SAME emulation.  The only problem was
that the block layer did not issue the WRITE_SAME request at all, but
reported success, and it affected all disks that do not support
WRITE_SAME.

Signed-off-by: Petr Vandrovec <petr@xxxxxxxxxx>
Cc: Arvind Kumar <arvindkumar@xxxxxxxxxx>
Cc: Chris J Arges <chris.j.arges@xxxxxxxxxxxxx>
Cc: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
 block/blk-core.c                |  2 +-
 block/blk-lib.c                 | 10 ++++++++++
 drivers/message/fusion/mptspi.c |  5 -----
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9c888bd..b070782 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1822,7 +1822,7 @@ generic_make_request_checks(struct bio *bio)
 	}
 
 	if (bio->bi_rw & REQ_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) {
-		err = -EOPNOTSUPP;
+		err = -EREMOTEIO;
 		goto end_io;
 	}
 
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 8411be3..abad72d 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -298,6 +298,16 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 					     ZERO_PAGE(0)))
 			return 0;
 
+		/*
+		 * If WRITE_SAME failed, inability to perform WRITE_SAME was
+		 * possibly recorded in device's queue by sd.c.  But in case
+		 * of LVM we are issuing request here on LVM device.  So
+		 * we should mark device as ineligible for WRITE_SAME here too,
+		 * as otherwise we keep trying to submit WRITE_SAME again and
+		 * again to LVM where they get promptly rejected by underlying
+		 * disk queue.
+		 */
+		blk_queue_max_write_same_sectors(bdev_get_queue(bdev), 0);
 		bdevname(bdev, bdn);
 		pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
 	}
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 613231c..787933d 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1419,11 +1419,6 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto out_mptspi_probe;
 	}
 
-	/* VMWare emulation doesn't properly implement WRITE_SAME
-	 */
-	if (pdev->subsystem_vendor == 0x15AD)
-		sh->no_write_same = 1;
-
 	spin_lock_irqsave(&ioc->FreeQlock, flags);
 
 	/* Attach the SCSI Host to the IOC structure
--
2.1.1
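
For illustration only, the failure mode described in the changelog can be modelled in user space. This is a rough sketch, not kernel code: every name in it (`toy_queue`, `toy_submit`, `toy_end_io`, `toy_zeroout`, `toy_run`) is hypothetical, and it assumes, as the description above states, that the completion handler does not treat EOPNOTSUPP as a failure and that a request which reaches hardware without WRITE_SAME support fails with EREMOTEIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; it only tracks WRITE_SAME state. */
struct toy_queue {
	bool write_same;         /* queue still advertises WRITE_SAME */
	struct toy_queue *lower; /* underlying disk queue, NULL for a real disk */
	bool hw_write_same;      /* hardware implements WRITE_SAME (disks only) */
	int issued;              /* commands that actually reached the hardware */
};

/* Submit path. reject_err is the errno used when a queue no longer
 * advertises WRITE_SAME (EOPNOTSUPP before the patch, EREMOTEIO after). */
static int toy_submit(struct toy_queue *q, int reject_err)
{
	if (!q->write_same)
		return -reject_err;  /* rejected before reaching hardware */
	if (q->lower)
		return toy_submit(q->lower, reject_err); /* stacked device */
	q->issued++;
	if (!q->hw_write_same) {
		q->write_same = false; /* failure recorded on the disk queue only */
		return -EREMOTEIO;     /* assumed "target failure" */
	}
	return 0;
}

/* Completion handler as described: EOPNOTSUPP is swallowed (assumption). */
static int toy_end_io(int err)
{
	return (err && err != -EOPNOTSUPP) ? err : 0;
}

/* Returns true if the caller fell back to manual zeroing. */
static bool toy_zeroout(struct toy_queue *top, int reject_err, bool propagate)
{
	if (top->write_same) {
		if (toy_end_io(toy_submit(top, reject_err)) == 0)
			return false;            /* "success", no fallback */
		if (propagate)
			top->write_same = false; /* mark the top device too */
	}
	return true;
}

/* Issue `rounds` zeroout requests to an LVM-like device stacked on a disk
 * without WRITE_SAME; return how many were reported successful although
 * nothing was written. */
static int toy_run(int reject_err, bool propagate, int rounds)
{
	struct toy_queue disk = { .write_same = true, .hw_write_same = false };
	struct toy_queue lvm = { .write_same = true, .lower = &disk };
	int dropped = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++)
		if (!toy_zeroout(&lvm, reject_err, propagate))
			dropped++;
	return dropped;
}
```

In this model, `toy_run(EOPNOTSUPP, false, 3)` corresponds to the behaviour before the patch: the first request fails at the disk and falls back to manual zeroing, while the next two are rejected at the disk queue and reported as successful. `toy_run(EREMOTEIO, true, 3)` corresponds to the patched behaviour, where every request ends up zeroed manually.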