Re: [PATCH 1/3] block: try one write zeroes request before going further

Tom Yan <tom.ty89@xxxxxxxxx> · Sun, 6 Dec 2020 21:25:46 +0800

I think you misunderstood it. The goal of this patch is to split what
is currently a single bio chain into two parts: one unchained bio
followed by a series of chained bios. The first bio is a trial that
makes sure the large chain that follows can actually be handled (as per
the "command capability" of the device).

P.S. I think I missed the fact that it requires my blk_next_bio()
patch to work properly. (It still seems like a typo bug to me.)
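
To make the intent concrete, here is a minimal userspace sketch of the
"try one request first" idea. It is only a toy model, not the kernel
code: struct toy_dev, toy_submit_wait() and toy_write_zeroes() are
made-up names, and the assumption that a rejected command makes the
driver clear the advertised limit is modelled by hand.

```c
#include <errno.h>

/* Toy stand-in for a block device; none of these names are kernel APIs. */
struct toy_dev {
	unsigned int max_wz_sectors;	/* advertised limit; 0 means unsupported */
	int really_supports_wz;		/* what the hardware can actually do */
	unsigned int cmds_sent;		/* commands that reached the device */
};

/*
 * Send one write zeroes command and wait for it. Assumption: when the
 * device rejects the command, the driver stops advertising the feature
 * by clearing the limit.
 */
static int toy_submit_wait(struct toy_dev *dev, unsigned long long sector,
			   unsigned int len)
{
	(void)sector;
	(void)len;
	dev->cmds_sent++;
	if (!dev->really_supports_wz) {
		dev->max_wz_sectors = 0;
		return -EIO;
	}
	return 0;
}

/*
 * Issue the first command on its own, re-check the limit, and only then
 * go on with the rest. The real code would chain the remaining bios
 * asynchronously; the toy simply sends them one by one.
 */
static int toy_write_zeroes(struct toy_dev *dev, unsigned long long sector,
			    unsigned long long nr_sects)
{
	int probed = 0;

	if (dev->max_wz_sectors == 0)
		return -EOPNOTSUPP;

	while (nr_sects) {
		unsigned int len = nr_sects > dev->max_wz_sectors ?
				   dev->max_wz_sectors : (unsigned int)nr_sects;

		toy_submit_wait(dev, sector, len);
		if (!probed) {
			probed = 1;
			/* The trial failed: stop before piling up more. */
			if (dev->max_wz_sectors == 0)
				return -EOPNOTSUPP;
		}
		sector += len;
		nr_sects -= len;
	}
	return 0;
}
```

With a device that advertises a limit but rejects the command, only the
single trial command reaches the device before the function bails out
with -EOPNOTSUPP; with a device that really supports it, the whole range
is issued as usual.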

On Sun, 6 Dec 2020 at 19:25, Hannes Reinecke <hare@xxxxxxx> wrote:
>
> On 12/6/20 6:53 AM, Tom Yan wrote:
> > At least the SCSI disk driver is "benevolent" when it tries to decide
> > whether the device actually supports write zeroes, i.e. unless the
> > device explicitly reports otherwise, it assumes it does at first.
> >
> > Therefore, before we pile up bios that would fail at the end, we try
> > the command/request once, as not doing so could trigger quite a
> > disaster in at least certain cases. For example, the host controller
> > can be messed up entirely when one runs `blkdiscard -z` on a UAS drive.
> >
> > Signed-off-by: Tom Yan <tom.ty89@xxxxxxxxx>
> > ---
> >   block/blk-lib.c | 14 +++++++++++++-
> >   1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/blk-lib.c b/block/blk-lib.c
> > index e90614fd8d6a..c1e9388a8fb8 100644
> > --- a/block/blk-lib.c
> > +++ b/block/blk-lib.c
> > @@ -250,6 +250,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
> >       struct bio *bio = *biop;
> >       unsigned int max_write_zeroes_sectors;
> >       struct request_queue *q = bdev_get_queue(bdev);
> > +     int i = 0;
> >
> >       if (!q)
> >               return -ENXIO;
> > @@ -264,7 +265,17 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
> >               return -EOPNOTSUPP;
> >
> >       while (nr_sects) {
> > -             bio = blk_next_bio(bio, 0, gfp_mask);
> > +             if (i != 1) {
> > +                     bio = blk_next_bio(bio, 0, gfp_mask);
> > +             } else {
> > +                     submit_bio_wait(bio);
> > +                     bio_put(bio);
> > +
> > +                     if (bdev_write_zeroes_sectors(bdev) == 0)
> > +                             return -EOPNOTSUPP;
> > +                     else
> > +                             bio = bio_alloc(gfp_mask, 0);
> > +             }
> >               bio->bi_iter.bi_sector = sector;
> >               bio_set_dev(bio, bdev);
> >               bio->bi_opf = REQ_OP_WRITE_ZEROES;
> > @@ -280,6 +291,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
> >                       nr_sects = 0;
> >               }
> >               cond_resched();
> > +             i++;
> >       }
> >
> >       *biop = bio;
> >
> We do want to keep the chain of bios intact such that end_io processing
> will recurse back to the original end_io callback.
> As such we need to call bio_chain on the first bio, submit that
> (possibly with submit_bio_wait()), and then decide whether we can /
> should continue.
> With your patch we'll lose the information that indeed other bios might
> be linked to the original one.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                Kernel Storage Architect
> hare@xxxxxxx                              +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer