Re: [PATCH] block: cope with WRITE SAME failing in blkdev_issue_zeroout()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 13, 2017 at 11:50 AM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> On Wed, Sep 6, 2017 at 7:38 PM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
>> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
>> is set.  This means blkdev_issue_zeroout() must cope with WRITE SAME
>> failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes,
>> unless BLKDEV_ZERO_NOFALLBACK is specified.
>>
>> Commit 71027e97d796 ("block: stop using discards for zeroing") added
>> the following to __blkdev_issue_zeroout() comment:
>>
>>   "Note that this function may fail with -EOPNOTSUPP if the driver signals
>>   zeroing offload support, but the device fails to process the command (for
>>   some devices there is no non-destructive way to verify whether this
>>   operation is actually supported).  In this case the caller should call
>>   retry the call to blkdev_issue_zeroout() and the fallback path will be used."
>>
>> But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME
>> support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned
>> to blkdev_issue_zeroout().  -EREMOTEIO is then propagated up:
>>
>>   $ fallocate -zn -l 1k /dev/sdg
>>   fallocate: fallocate failed: Remote I/O error
>>   $ fallocate -zn -l 1k /dev/sdg  # OK
>>
>> (The second fallocate(1) succeeds because sd_done() sets ->no_write_same
>> in response to a sense that would become BLK_STS_TARGET.)
>>
>> Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO.  This
>> is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob.
>>
>> Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
>> Cc: Christoph Hellwig <hch@xxxxxx>
>> Cc: "Martin K. Petersen" <martin.petersen@xxxxxxxxxx>
>> Cc: Hannes Reinecke <hare@xxxxxxxx>
>> Cc: stable@xxxxxxxxxxxxxxx # 4.12+
>> Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
>> ---
>>  block/blk-lib.c | 49 +++++++++++++++++++++++++++++++++----------------
>>  1 file changed, 33 insertions(+), 16 deletions(-)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 3fe0aec90597..876b5478c1a2 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -287,12 +287,6 @@ static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
>>   *  Zero-fill a block range, either using hardware offload or by explicitly
>>   *  writing zeroes to the device.
>>   *
>> - *  Note that this function may fail with -EOPNOTSUPP if the driver signals
>> - *  zeroing offload support, but the device fails to process the command (for
>> - *  some devices there is no non-destructive way to verify whether this
>> - *  operation is actually supported).  In this case the caller should call
>> - *  retry the call to blkdev_issue_zeroout() and the fallback path will be used.
>> - *
>>   *  If a device is using logical block provisioning, the underlying space will
>>   *  not be released if %flags contains BLKDEV_ZERO_NOUNMAP.
>>   *
>> @@ -343,6 +337,34 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>>  }
>>  EXPORT_SYMBOL(__blkdev_issue_zeroout);
>>
>> +/*
>> + * This function may fail with -EREMOTEIO if the driver signals zeroing
>> + * offload support, but the device fails to process the command (for
>> + * some devices there is no non-destructive way to verify whether this
>> + * operation is actually supported).  In this case the caller should
>> + * retry and the fallback path in __blkdev_issue_zeroout() will be used
>> + * unless %flags contains BLKDEV_ZERO_NOFALLBACK.
>> + */
>> +static int __blkdev_issue_zeroout_wait(struct block_device *bdev,
>> +                                      sector_t sector, sector_t nr_sects,
>> +                                      gfp_t gfp_mask, unsigned flags)
>> +{
>> +       int ret;
>> +       struct bio *bio = NULL;
>> +       struct blk_plug plug;
>> +
>> +       blk_start_plug(&plug);
>> +       ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
>> +                       &bio, flags);
>> +       if (ret == 0 && bio) {
>> +               ret = submit_bio_wait(bio);
>> +               bio_put(bio);
>> +       }
>> +       blk_finish_plug(&plug);
>> +
>> +       return ret;
>> +}
>> +
>>  /**
>>   * blkdev_issue_zeroout - zero-fill a block range
>>   * @bdev:      blockdev to write
>> @@ -360,17 +382,12 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>>                 sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
>>  {
>>         int ret;
>> -       struct bio *bio = NULL;
>> -       struct blk_plug plug;
>>
>> -       blk_start_plug(&plug);
>> -       ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
>> -                       &bio, flags);
>> -       if (ret == 0 && bio) {
>> -               ret = submit_bio_wait(bio);
>> -               bio_put(bio);
>> -       }
>> -       blk_finish_plug(&plug);
>> +       ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects,
>> +                                         gfp_mask, flags);
>> +       if (ret == -EREMOTEIO)
>> +               ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects,
>> +                                                 gfp_mask, flags);
>>
>>         return ret;
>>  }
>
> Ping...  This fixes a regression introduced in 4.12.  Christoph,
> Martin, could you please take a look?

Christoph, the regression got introduced with your "always use
REQ_OP_WRITE_ZEROES for zeroing offload" series, could you please look
at this?

Thanks,

                Ilya



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux