On Thu, Jan 23, 2025 at 9:56 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
> On Thu, Jan 23, 2025 at 12:32 AM Eugenio Pérez <eperezma@xxxxxxxxxx> wrote:
> >
> > This is useful for some setups like swiotlb or VDUSE where the DMA
> > operations are expensive and/or need to be performed with a write lock.
> >
> > After applying this patch, fio read test goes from 1201MiB/s to 1211MiB/s.
>
> The difference is too small to differentiate it from the noise.
>
> I would suggest to test with different setups.
>
> 1) SWIOTLB 2) VDUSE
>
> Note that SWIOTLB will do bouncing even for DMA_FROM_DEVICE,

I meant dma map in this case actually.

Thanks

> so I
> think we may see better performance there.
>
> And we need to try with different request size, I did a similar patch
> for virtio-blk and I see better performance for large request like 1M
> etc.
>
> Thanks
>
> >
> > Signed-off-by: Eugenio Pérez <eperezma@xxxxxxxxxx>
> > ---
> >  drivers/virtio/virtio_ring.c |  2 ++
> >  fs/fuse/virtio_fs.c          | 25 +++++++++++++++++++++++--
> >  2 files changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index e49912fa77c5..eb22bfcb9100 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -580,6 +580,7 @@ int virtqueue_map_sgs(struct virtqueue *_vq,
> >  				goto unmap_release;
> >
> >  			sg_dma_address(sg) = addr;
> > +			sg_dma_len(sg) = sg->length;
> >  			mapped_sg++;
> >  		}
> >  	}
> > @@ -592,6 +593,7 @@ int virtqueue_map_sgs(struct virtqueue *_vq,
> >  				goto unmap_release;
> >
> >  			sg_dma_address(sg) = addr;
> > +			sg_dma_len(sg) = sg->length;
> >  			mapped_sg++;
> >  		}
> >  	}
> > diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> > index 1344c5782a7c..2b558b05d0f8 100644
> > --- a/fs/fuse/virtio_fs.c
> > +++ b/fs/fuse/virtio_fs.c
> > @@ -836,8 +836,21 @@ static void virtio_fs_requests_done_work(struct work_struct *work)
> >
> >  	/* End requests */
> >  	list_for_each_entry_safe(req, next, &reqs, list) {
> > +		struct scatterlist *stack_sgs[6];
> > +		struct scatterlist **sgs = stack_sgs;
> > +		unsigned int total_sgs = req->out_sgs + req->in_sgs;
> > +
> >  		list_del_init(&req->list);
> >
> > +		/* TODO replace magic 6 by a macro */
> > +		if (total_sgs > 6)
> > +			sgs = kmalloc_array(total_sgs, sizeof(sgs[0]), GFP_ATOMIC);
> > +
> > +		for (unsigned int i = 0; i < total_sgs; ++i)
> > +			sgs[i] = &req->sg[i];
> > +
> > +		virtqueue_unmap_sgs(vq, sgs, req->out_sgs, req->in_sgs);
> > +
> >  		/* blocking async request completes in a worker context */
> >  		if (req->args->may_block) {
> >  			struct virtio_fs_req_work *w;
> > @@ -850,6 +863,9 @@ static void virtio_fs_requests_done_work(struct work_struct *work)
> >  		} else {
> >  			virtio_fs_request_complete(req, fsvq);
> >  		}
> > +
> > +		if (sgs != stack_sgs)
> > +			kfree(sgs);
> >  	}
> >
> >  	/* Try to push previously queued requests, as the queue might no longer be full */
> > @@ -1426,6 +1442,11 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
> >  		sgs[i] = &req->sg[i];
> >  	WARN_ON(req->out_sgs + req->in_sgs != total_sgs);
> >
> > +	// TODO can we change this ptr out of the lock?
> > +	vq = fsvq->vq;
> > +	// TODO handle this and following errors
> > +	ret = virtqueue_map_sgs(vq, sgs, req->out_sgs, req->in_sgs);
> > +	BUG_ON(ret < 0);
> >  	spin_lock(&fsvq->lock);
> >
> >  	if (!fsvq->connected) {
> > @@ -1434,8 +1455,8 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
> >  		goto out;
> >  	}
> >
> > -	vq = fsvq->vq;
> > -	ret = virtqueue_add_sgs(vq, sgs, req->out_sgs, req->in_sgs, req, GFP_ATOMIC);
> > +	ret = virtqueue_add_sgs_premapped(vq, sgs, req->out_sgs,
> > +					  req->in_sgs, req, GFP_ATOMIC);
> >  	if (ret < 0) {
> >  		spin_unlock(&fsvq->lock);
> >  		goto out;
> > --
> > 2.48.1
> >
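
For readers following the thread, the idea in the quoted patch (do the possibly expensive mapping before taking the queue lock, then add only premapped buffers while holding it) can be sketched in plain user-space C. This is a minimal illustration, not kernel code: struct seg, struct ring, map_segs(), unmap_segs(), ring_add_premapped() and enqueue() are hypothetical stand-ins for the scatterlist, the virtqueue, virtqueue_map_sgs(), virtqueue_unmap_sgs() and virtqueue_add_sgs_premapped(), and a C11 atomic_flag stands in for the spinlock. It also assumes that a mapping failure is returned to the caller, where the patch still has BUG_ON() and a TODO.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
	void *buf;
	size_t len;
	uintptr_t dma_addr;	/* filled in by map_segs() */
	size_t dma_len;		/* mirrors the sg_dma_len() assignment in the patch */
};

/* Hypothetical stand-in for the virtqueue plus its spinlock. */
struct ring {
	atomic_flag lock;
	unsigned int used;
	const struct seg *slots[RING_SIZE];
};

/* Stand-in for the expensive mapping step; it does not touch the ring. */
static int map_segs(struct seg *segs, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		if (!segs[i].buf)
			return -1;	/* mapping failure */
		segs[i].dma_addr = (uintptr_t)segs[i].buf;
		segs[i].dma_len = segs[i].len;
	}
	return 0;
}

static void unmap_segs(struct seg *segs, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		segs[i].dma_addr = 0;
		segs[i].dma_len = 0;
	}
}

/* Must be called with the ring lock held and with segments already mapped. */
static int ring_add_premapped(struct ring *r, const struct seg *segs,
			      unsigned int n)
{
	if (n > RING_SIZE - r->used)
		return -1;	/* ring full */
	for (unsigned int i = 0; i < n; i++) {
		assert(segs[i].dma_len == segs[i].len);
		r->slots[r->used++] = &segs[i];
	}
	return 0;
}

static int enqueue(struct ring *r, struct seg *segs, unsigned int n)
{
	/* Expensive work happens before the lock is taken. */
	int ret = map_segs(segs, n);

	if (ret < 0)
		return ret;

	while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
		;	/* spin */
	ret = ring_add_premapped(r, segs, n);
	atomic_flag_clear_explicit(&r->lock, memory_order_release);

	/* If the add failed, undo the mapping, again outside the lock. */
	if (ret < 0)
		unmap_segs(segs, n);
	return ret;
}
```

With this split the critical section shrinks to the ring bookkeeping, which is where a benefit would be expected on setups where mapping is costly.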