On Wed, Sep 6, 2017 at 7:38 PM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote: > sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to > permit trying WRITE SAME on older SCSI devices, unless ->no_write_same > is set. This means blkdev_issue_zeroout() must cope with WRITE SAME > failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes, > unless BLKDEV_ZERO_NOFALLBACK is specified. > > Commit 71027e97d796 ("block: stop using discards for zeroing") added > the following to __blkdev_issue_zeroout() comment: > > "Note that this function may fail with -EOPNOTSUPP if the driver signals > zeroing offload support, but the device fails to process the command (for > some devices there is no non-destructive way to verify whether this > operation is actually supported). In this case the caller should call > retry the call to blkdev_issue_zeroout() and the fallback path will be used." > > But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME > support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned > to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up: > > $ fallocate -zn -l 1k /dev/sdg > fallocate: fallocate failed: Remote I/O error > $ fallocate -zn -l 1k /dev/sdg # OK > > (The second fallocate(1) succeeds because sd_done() sets ->no_write_same > in response to a sense that would become BLK_STS_TARGET.) > > Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This > is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob. > > Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing") > Cc: Christoph Hellwig <hch@xxxxxx> > Cc: "Martin K. Petersen" <martin.petersen@xxxxxxxxxx> > Cc: Hannes Reinecke <hare@xxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx # 4.12+ > Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx> > --- > block/blk-lib.c | 49 +++++++++++++++++++++++++++++++++---------------- > 1 file changed, 33 insertions(+), 16 deletions(-) > > diff --git a/block/blk-lib.c b/block/blk-lib.c > index 3fe0aec90597..876b5478c1a2 100644 > --- a/block/blk-lib.c > +++ b/block/blk-lib.c > @@ -287,12 +287,6 @@ static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) > * Zero-fill a block range, either using hardware offload or by explicitly > * writing zeroes to the device. > * > - * Note that this function may fail with -EOPNOTSUPP if the driver signals > - * zeroing offload support, but the device fails to process the command (for > - * some devices there is no non-destructive way to verify whether this > - * operation is actually supported). In this case the caller should call > - * retry the call to blkdev_issue_zeroout() and the fallback path will be used. > - * > * If a device is using logical block provisioning, the underlying space will > * not be released if %flags contains BLKDEV_ZERO_NOUNMAP. > * > @@ -343,6 +337,34 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, > } > EXPORT_SYMBOL(__blkdev_issue_zeroout); > > +/* > + * This function may fail with -EREMOTEIO if the driver signals zeroing > + * offload support, but the device fails to process the command (for > + * some devices there is no non-destructive way to verify whether this > + * operation is actually supported). In this case the caller should > + * retry and the fallback path in __blkdev_issue_zeroout() will be used > + * unless %flags contains BLKDEV_ZERO_NOFALLBACK. > + */ > +static int __blkdev_issue_zeroout_wait(struct block_device *bdev, > + sector_t sector, sector_t nr_sects, > + gfp_t gfp_mask, unsigned flags) > +{ > + int ret; > + struct bio *bio = NULL; > + struct blk_plug plug; > + > + blk_start_plug(&plug); > + ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, > + &bio, flags); > + if (ret == 0 && bio) { > + ret = submit_bio_wait(bio); > + bio_put(bio); > + } > + blk_finish_plug(&plug); > + > + return ret; > +} > + > /** > * blkdev_issue_zeroout - zero-fill a block range > * @bdev: blockdev to write > @@ -360,17 +382,12 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, > sector_t nr_sects, gfp_t gfp_mask, unsigned flags) > { > int ret; > - struct bio *bio = NULL; > - struct blk_plug plug; > > - blk_start_plug(&plug); > - ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, > - &bio, flags); > - if (ret == 0 && bio) { > - ret = submit_bio_wait(bio); > - bio_put(bio); > - } > - blk_finish_plug(&plug); > + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, > + gfp_mask, flags); > + if (ret == -EREMOTEIO) > + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, > + gfp_mask, flags); > > return ret; > } Ping... This fixes a regression introduced in 4.12. Christoph, Martin, could you please take a look? Thanks, Ilya