On Wed, Aug 12, 2020 at 10:52:58AM +0800, Ming Lei wrote:
> On Wed, Aug 12, 2020 at 10:07:06AM +0800, Baolin Wang wrote:
> > Hi Ming,
> > 
> > On Wed, Aug 12, 2020 at 07:44:19AM +0800, Ming Lei wrote:
> > > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> > > to support multi-range discard for virtio-blk. However, the virtio-blk
> > > disk may report the max discard segment limit as 1, which is exactly
> > > what qemu does.
> > > 
> > > So far, the block layer switches to normal request merge if the max
> > > discard segment limit is 1, and multiple bios can be merged into a
> > > single segment. This can cause memory corruption in
> > > virtblk_setup_discard_write_zeroes().
> > > 
> > > Fix the issue by handling the single max discard segment case in a
> > > straightforward way.
> > > 
> > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > > Cc: Christoph Hellwig <hch@xxxxxx>
> > > Cc: Changpeng Liu <changpeng.liu@xxxxxxxxx>
> > > Cc: Daniel Verkamp <dverkamp@xxxxxxxxxxxx>
> > > Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > Cc: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
> > > Cc: Stefano Garzarella <sgarzare@xxxxxxxxxx>
> > > ---
> > >  drivers/block/virtio_blk.c | 31 +++++++++++++++++++++++--------
> > >  1 file changed, 23 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > > index 63b213e00b37..b2e48dac1ebd 100644
> > > --- a/drivers/block/virtio_blk.c
> > > +++ b/drivers/block/virtio_blk.c
> > > @@ -126,16 +126,31 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
> > >  	if (!range)
> > >  		return -ENOMEM;
> > >  
> > > -	__rq_for_each_bio(bio, req) {
> > > -		u64 sector = bio->bi_iter.bi_sector;
> > > -		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> > > -
> > > -		range[n].flags = cpu_to_le32(flags);
> > > -		range[n].num_sectors = cpu_to_le32(num_sectors);
> > > -		range[n].sector = cpu_to_le64(sector);
> > > -		n++;
> > > +	/*
> > > +	 * Single max discard segment means multi-range discard isn't
> > > +	 * supported, and the block layer only runs contiguity merge like
> > > +	 * a normal RW request. So we can't rely on the bios for
> > > +	 * retrieving each range's info.
> > > +	 */
> > > +	if (queue_max_discard_segments(req->q) == 1) {
> > > +		range[0].flags = cpu_to_le32(flags);
> > > +		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
> > > +		range[0].sector = cpu_to_le64(blk_rq_pos(req));
> > > +		n = 1;
> > > +	} else {
> > > +		__rq_for_each_bio(bio, req) {
> > > +			u64 sector = bio->bi_iter.bi_sector;
> > > +			u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> > > +
> > > +			range[n].flags = cpu_to_le32(flags);
> > > +			range[n].num_sectors = cpu_to_le32(num_sectors);
> > > +			range[n].sector = cpu_to_le64(sector);
> > > +			n++;
> > > +		}
> > >  	}
> > >  
> > > +	WARN_ON_ONCE(n != segments);
> > 
> > I wonder if we should return an error if the discard segments are
> > incorrect, like NVMe does[1], in case the DMA does some serious damage.
> 
> It is an unlikely case:
> 
> 1) if queue_max_discard_segments() is 1, the warning can't be triggered
> 
> 2) otherwise, ELEVATOR_DISCARD_MERGE is always handled in
> bio_attempt_discard_merge(), and the segment number is always the same
> as the number of bios in the request.
> 
> If the warning is triggered, it is simply a serious bug in the block
> layer.
> 
> BTW, suppose the warning is triggered:
> 
> 1) if n < segments, it is simply a warning
> 
> 2) if n > segments, no matter whether something like nvme_setup_discard()
> is done, a serious memory corruption issue has already been caused.
> 
> So handling it in NVMe's style wouldn't make a difference.

OK. Sounds reasonable. Thanks.
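
To make the failure mode above concrete: in the pre-patch loop quoted in
the diff, range[] is sized from blk_rq_nr_discard_segments(), while
__rq_for_each_bio() walks every bio in the request, and the two counts
disagree once the block layer has merged discard bios like normal RW
requests. A sketch with hypothetical numbers (two contiguous discard
bios; illustration only, not code from the thread):

	/*
	 * Hypothetical pre-patch scenario: queue_max_discard_segments(q)
	 * is 1, and two contiguous discard bios (sectors 0-7 and 8-15)
	 * were merged into one request by the normal merge path, so the
	 * request counts as one physical segment but carries two bios.
	 */
	segments = blk_rq_nr_discard_segments(req);	/* == 1 */
	range = kmalloc_array(segments, sizeof(*range), GFP_ATOMIC);
	if (!range)
		return -ENOMEM;

	__rq_for_each_bio(bio, req) {			/* iterates twice */
		range[n].flags = cpu_to_le32(flags);
		range[n].num_sectors = cpu_to_le32(bio->bi_iter.bi_size >> SECTOR_SHIFT);
		range[n].sector = cpu_to_le64(bio->bi_iter.bi_sector);
		n++;	/* second pass writes range[1], past the allocation */
	}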
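
For comparison, the nvme_setup_discard()-style handling Baolin asks
about would look roughly like the following in
virtblk_setup_discard_write_zeroes() (a minimal sketch, not part of the
posted patch; the -EIO value is an assumption based on the function's
int return type):

	/*
	 * Sketch of NVMe-style handling: fail the request instead of
	 * only warning when the built range count disagrees with
	 * blk_rq_nr_discard_segments(). Assumed, not from the posted patch.
	 */
	if (WARN_ON_ONCE(n != segments)) {
		kfree(range);
		return -EIO;	/* assumed error value */
	}

As Ming points out, though, once n exceeds segments the out-of-bounds
writes have already happened, so the early return would only stop the
bad payload from reaching the device.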