On 2020/7/30 下午4:30, Jeffle Xu wrote:
Before commit eded341c085b ("block: don't decrement nr_phys_segments for
physically contigous segments") applied, the generic block layer may not
guarantee that @req->nr_phys_segments equals the number of bios in the
request. When limits.@max_discard_segments == 1 and the IO scheduler is
set to scheduler except for "none" (mq-deadline, kyber or bfq, e.g.),
the requests merge routine may be called when enqueuing a DISCARD bio.
When merging two requests, @req->nr_phys_segments will minus one if the
middle two bios of the requests can be merged into one physical segment,
causing @req->nr_phys_segments one less than the number of bios in the
DISCARD request.
In this case, we are in risk of overruning virtio_blk_discard_write_zeroes
buffers. Though this issue has been fixed by commit eded341c085b
("block: don't decrement nr_phys_segments for physically contigous segments"),
it can recure once the generic block layer breaks the guarantee as code
refactoring.
commit 8cb6af7b3a6d ("nvme: Fix discard buffer overrun") has fixed the similar
issue in nvme driver. This patch is also inspired by this commit.
Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx>
Reviewed-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx>
---
drivers/block/virtio_blk.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 980df853ee49..248c8f46b51c 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -130,12 +130,19 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
u64 sector = bio->bi_iter.bi_sector;
u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
- range[n].flags = cpu_to_le32(flags);
- range[n].num_sectors = cpu_to_le32(num_sectors);
- range[n].sector = cpu_to_le64(sector);
+ if (n < segments) {
+ range[n].flags = cpu_to_le32(flags);
+ range[n].num_sectors = cpu_to_le32(num_sectors);
+ range[n].sector = cpu_to_le64(sector);
+ }
Not familiar with block but if we start to duplicate checks like this,
it's a hint to move it in the core.
n++;
}
+ if (WARN_ON_ONCE(n != segments)) {
+ kfree(range);
+ return -EIO;
+ }
+
req->special_vec.bv_page = virt_to_page(range);
req->special_vec.bv_offset = offset_in_page(range);
req->special_vec.bv_len = sizeof(*range) * segments;
@@ -246,8 +253,14 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
if (type == VIRTIO_BLK_T_DISCARD || type == VIRTIO_BLK_T_WRITE_ZEROES) {
err = virtblk_setup_discard_write_zeroes(req, unmap);
- if (err)
- return BLK_STS_RESOURCE;
+ if (err) {
+ switch (err) {
+ case -ENOMEM:
+ return BLK_STS_RESOURCE;
+ default:
+ return BLK_STS_IOERR;
+ }
+ }
This looks not elegant, why not simple if (err == -ENOMEM) else if
(err) ...
Thanks
}
num = blk_rq_map_sg(hctx->queue, req, vbr->sg);
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization