On Mon, Jun 11, 2018 at 03:37:00AM +0000, Liu, Changpeng wrote: > > > > -----Original Message----- > > From: Stefan Hajnoczi [mailto:stefanha@xxxxxxxxxx] > > Sent: Friday, June 8, 2018 6:20 PM > > To: Liu, Changpeng <changpeng.liu@xxxxxxxxx> > > Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx; cavery@xxxxxxxxxx; > > jasowang@xxxxxxxxxx; pbonzini@xxxxxxxxxx; Wang, Wei W > > <wei.w.wang@xxxxxxxxx> > > Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands > > support > > > > On Thu, Jun 07, 2018 at 11:07:06PM +0000, Liu, Changpeng wrote: > > > > > > > > > > -----Original Message----- > > > > From: Stefan Hajnoczi [mailto:stefanha@xxxxxxxxxx] > > > > Sent: Thursday, June 7, 2018 9:10 PM > > > > To: Liu, Changpeng <changpeng.liu@xxxxxxxxx> > > > > Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx; cavery@xxxxxxxxxx; > > > > jasowang@xxxxxxxxxx; pbonzini@xxxxxxxxxx; Wang, Wei W > > > > <wei.w.wang@xxxxxxxxx> > > > > Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands > > > > support > > > > > > > > On Wed, Jun 06, 2018 at 12:19:00PM +0800, Changpeng Liu wrote: > > > > > Existing virtio-blk protocol doesn't have DISCARD/WRITE ZEROES commands > > > > > support, this will impact the performance when using SSD backend over > > > > > file systems. > > > > > > > > > > Commit 88c85538 "virtio-blk: add discard and write zeroes features to > > > > > specification"(see https://github.com/oasis-tcs/virtio-spec) extended > > > > > existing virtio-blk protocol, adding extra DISCARD and WRITE ZEROES > > > > > commands support. > > > > > > > > > > While here, using 16 bytes descriptor to describe one segment of DISCARD > > > > > or WRITE ZEROES commands, each command may contain one or more > > > > decriptors. > > > > > > > > > > The following data structure shows the definition of one descriptor: > > > > > > > > > > struct virtio_blk_discard_write_zeroes { > > > > > le64 sector; > > > > > le32 num_sectors; > > > > > le32 unmap; > > > > > }; > > > > > > > > > > Field 'sector' means the start sector for DISCARD and WRITE ZEROES, > > > > > filed 'num_sectors' means the number of sectors for DISCARD and WRITE > > > > > ZEROES, if both DISCARD and WRITE ZEROES are supported, field 'unmap' > > > > > maybe used for WRITE ZEROES command with DISCARD enabled. > > > > > > > > > > We also extended the virtio-blk configuration space to let backend > > > > > device put DISCARD and WRITE ZEROES configuration parameters. > > > > > > > > > > struct virtio_blk_config { > > > > > [...] > > > > > > > > > > le32 max_discard_sectors; > > > > > le32 max_discard_seg; > > > > > le32 discard_sector_alignment; > > > > > le32 max_write_zeroes_sectors; > > > > > le32 max_write_zeroes_seg; > > > > > u8 write_zeroes_may_unmap; > > > > > } > > > > > > > > > > New feature bit [VIRTIO_BLK_F_DISCARD (13)]: Device can support discard > > > > > command, maximum discard sectors size in field 'max_discard_sectors' and > > > > > maximum discard segment number in field 'max_discard_seg'. > > > > > > > > > > New feature [VIRTIO_BLK_F_WRITE_ZEROES (14)]: Device can support write > > > > > zeroes command, maximum write zeroes sectors size in field > > > > > 'max_write_zeroes_sectors' and maximum write zeroes segment number in > > > > > field 'max_write_zeroes_seg'. > > > > > > > > > > The parameters in the configuration space of the device field > > > > > 'max_discard_sectors' and field 'discard_sector_alignment' are expressed in > > > > > 512-byte units if the VIRTIO_BLK_F_DISCARD feature bit is negotiated. The > > > > > field 'max_write_zeroes_sectors' is expressed in 512-byte units if the > > > > > VIRTIO_BLK_F_WRITE_ZEROES feature bit is negotiated. > > > > > > > > > > Signed-off-by: Changpeng Liu <changpeng.liu@xxxxxxxxx> > > > > > --- > > > > > CHANGELOG: > > > > > v6: don't set T_OUT bit to discard and write zeroes commands. > > > > > > > > I don't see this in the patch... > > > Yeah, do noting with DISCARD/WRITE ZEROES means no need to OR BLK_T_OUT > > again. > > > > > > > > > @@ -225,6 +260,7 @@ static blk_status_t virtio_queue_rq(struct > > > > blk_mq_hw_ctx *hctx, > > > > > int qid = hctx->queue_num; > > > > > int err; > > > > > bool notify = false; > > > > > + bool unmap = false; > > > > > u32 type; > > > > > > > > > > BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems); > > > > > @@ -237,6 +273,13 @@ static blk_status_t virtio_queue_rq(struct > > > > blk_mq_hw_ctx *hctx, > > > > > case REQ_OP_FLUSH: > > > > > type = VIRTIO_BLK_T_FLUSH; > > > > > break; > > > > > + case REQ_OP_DISCARD: > > > > > + type = VIRTIO_BLK_T_DISCARD; > > > > > + break; > > > > > + case REQ_OP_WRITE_ZEROES: > > > > > + type = VIRTIO_BLK_T_WRITE_ZEROES; > > > > > + unmap = !(req->cmd_flags & REQ_NOUNMAP); > > > > > + break; > > > > > case REQ_OP_SCSI_IN: > > > > > case REQ_OP_SCSI_OUT: > > > > > type = VIRTIO_BLK_T_SCSI_CMD; > > > > > @@ -256,6 +299,12 @@ static blk_status_t virtio_queue_rq(struct > > > > blk_mq_hw_ctx *hctx, > > > > > > > > > > blk_mq_start_request(req); > > > > > > > > > > + if (type == VIRTIO_BLK_T_DISCARD || type == > > > > VIRTIO_BLK_T_WRITE_ZEROES) { > > > > > + err = virtblk_setup_discard_write_zeroes(req, unmap); > > > > > + if (err) > > > > > + return BLK_STS_RESOURCE; > > > > > + } > > > > > + > > > > > num = blk_rq_map_sg(hctx->queue, req, vbr->sg); > > > > > if (num) { > > > > > if (rq_data_dir(req) == WRITE) > > > > > > > > ...since we still do blk_rq_map_sg() here and num should be != 0. > > > No, while here, we should keep the original logic for READ/WRITE commands. > > > > My question is: why does the changelog say "don't set T_OUT" but the > > code *will* set it because blk_rq_map_sg() returns != 0 and > > rq_data_dir(req) == WRITE? > Since the last bit of DISCARD/WRITE ZEROES commands are already 1, so I said we don't need to set > T_OUT bit to DISCARD/WRITE ZEROES commands again. But the original logic for WRITE, T_OUT is still > needed, so just keep the original code here is fine. Okay, I understand what you meant now. Thanks! Stefan
Attachment:
signature.asc
Description: PGP signature
_______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization