Hi Ming,

On Wed, Aug 12, 2020 at 07:44:19AM +0800, Ming Lei wrote:
> 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") starts
> to support multi-range discard for virtio-blk. However, the virtio-blk
> disk may report max discard segment as 1, at least that is exactly what
> qemu is doing.
>
> So far, block layer switches to normal request merge if max discard segment
> limit is 1, and multiple bios can be merged to single segment. This way may
> cause memory corruption in virtblk_setup_discard_write_zeroes().
>
> Fix the issue by handling single max discard segment in straightforward
> way.
>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Changpeng Liu <changpeng.liu@xxxxxxxxx>
> Cc: Daniel Verkamp <dverkamp@xxxxxxxxxxxx>
> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Cc: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
> Cc: Stefano Garzarella <sgarzare@xxxxxxxxxx>
> ---
>  drivers/block/virtio_blk.c | 31 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 63b213e00b37..b2e48dac1ebd 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -126,16 +126,31 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
>  	if (!range)
>  		return -ENOMEM;
>
> -	__rq_for_each_bio(bio, req) {
> -		u64 sector = bio->bi_iter.bi_sector;
> -		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> -
> -		range[n].flags = cpu_to_le32(flags);
> -		range[n].num_sectors = cpu_to_le32(num_sectors);
> -		range[n].sector = cpu_to_le64(sector);
> -		n++;
> +	/*
> +	 * Single max discard segment means multi-range discard isn't
> +	 * supported, and block layer only runs contiguity merge like
> +	 * normal RW request. So we can't reply on bio for retrieving
> +	 * each range info.
> +	 */
> +	if (queue_max_discard_segments(req->q) == 1) {
> +		range[0].flags = cpu_to_le32(flags);
> +		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
> +		range[0].sector = cpu_to_le64(blk_rq_pos(req));
> +		n = 1;
> +	} else {
> +		__rq_for_each_bio(bio, req) {
> +			u64 sector = bio->bi_iter.bi_sector;
> +			u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> +
> +			range[n].flags = cpu_to_le32(flags);
> +			range[n].num_sectors = cpu_to_le32(num_sectors);
> +			range[n].sector = cpu_to_le64(sector);
> +			n++;
> +		}
>  	}
>
> +	WARN_ON_ONCE(n != segments);

I wonder whether we should return an error here if the number of discard
segments is incorrect, as NVMe does [1]? Otherwise the DMA could do some
serious damage.

[1] https://elixir.bootlin.com/linux/v5.8-rc7/source/drivers/nvme/host/core.c#L638

> +
>  	req->special_vec.bv_page = virt_to_page(range);
>  	req->special_vec.bv_offset = offset_in_page(range);
>  	req->special_vec.bv_len = sizeof(*range) * segments;
> --
> 2.25.2
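
To make the question about returning an error more concrete, here is a
rough userspace model of the idea, not kernel code and not a proposed
patch. As far as I can tell, the NVMe code in [1] only fills range[n]
while n < segments and fails the request when the final count differs.
All names below (model_bio, model_range, model_setup_discard) are
hypothetical stand-ins, the endianness conversions are left out, and
-EIO is only my assumption for the error value.

```c
/* Userspace model only; these types are hypothetical stand-ins for the
 * kernel's bios and the virtio-blk discard range, not real kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct model_bio {		/* stand-in for one bio of the request */
	uint64_t sector;
	uint32_t num_sectors;
};

struct model_range {		/* sector, num_sectors, flags, as in the patch */
	uint64_t sector;
	uint32_t num_sectors;
	uint32_t flags;
};

static_assert(sizeof(struct model_range) == 16, "unexpected range size");

/*
 * Build one range per bio. 'segments' models the segment count the block
 * layer reported, which is also the size of the allocation. On success
 * the caller owns *out; on a count mismatch the buffer is freed and
 * -EIO is returned, so nothing inconsistent is handed to the device.
 */
static int model_setup_discard(const struct model_bio *bios, size_t nr_bios,
			       unsigned short segments, uint32_t flags,
			       struct model_range **out)
{
	struct model_range *range;
	size_t n = 0;

	*out = NULL;
	range = calloc(segments, sizeof(*range));
	if (!range)
		return -ENOMEM;

	for (size_t i = 0; i < nr_bios; i++) {
		/* Never write past the allocation, but keep counting. */
		if (n < segments) {
			range[n].flags = flags;
			range[n].num_sectors = bios[i].num_sectors;
			range[n].sector = bios[i].sector;
		}
		n++;
	}

	/* Kernel analogue: if (WARN_ON_ONCE(n != segments)) { ... } */
	if (n != segments) {
		free(range);
		return -EIO;
	}

	*out = range;
	return 0;
}
```

With two bios and segments == 2 this returns 0 and two filled ranges;
with the same two bios and segments == 1 it returns -EIO without writing
to range[1]. The bounds check inside the loop matters as much as the
final check, since the out-of-bounds write would otherwise happen before
the warning fires.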