On Thu, Jan 23, 2025 at 12:32 AM Eugenio Pérez <eperezma@xxxxxxxxxx> wrote:
>
> This is useful for some setups like swiotlb or VDUSE where the DMA
> operations are expensive and/or need to be performed with a write lock.
>
> After applying this patch, fio read test goes from 1201MiB/s to 1211MiB/s.

The difference is too small to distinguish it from the noise. I would
suggest testing with different setups:

1) SWIOTLB
2) VDUSE

Note that SWIOTLB will do bouncing even for DMA_FROM_DEVICE, so I
think we may see better performance there.

We also need to try different request sizes. I did a similar patch
for virtio-blk and saw better performance for large requests like 1M
(see the example fio invocation at the end of this mail).

Thanks

>
> Signed-off-by: Eugenio Pérez <eperezma@xxxxxxxxxx>
> ---
>  drivers/virtio/virtio_ring.c |  2 ++
>  fs/fuse/virtio_fs.c          | 25 +++++++++++++++++++++++--
>  2 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index e49912fa77c5..eb22bfcb9100 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -580,6 +580,7 @@ int virtqueue_map_sgs(struct virtqueue *_vq,
>                                  goto unmap_release;
>
>                          sg_dma_address(sg) = addr;
> +                        sg_dma_len(sg) = sg->length;
>                          mapped_sg++;
>                  }
>          }
> @@ -592,6 +593,7 @@ int virtqueue_map_sgs(struct virtqueue *_vq,
>                                  goto unmap_release;
>
>                          sg_dma_address(sg) = addr;
> +                        sg_dma_len(sg) = sg->length;
>                          mapped_sg++;
>                  }
>          }
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 1344c5782a7c..2b558b05d0f8 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -836,8 +836,21 @@ static void virtio_fs_requests_done_work(struct work_struct *work)
>
>          /* End requests */
>          list_for_each_entry_safe(req, next, &reqs, list) {
> +                struct scatterlist *stack_sgs[6];
> +                struct scatterlist **sgs = stack_sgs;
> +                unsigned int total_sgs = req->out_sgs + req->in_sgs;
> +
>                  list_del_init(&req->list);
>
> +                /* TODO replace magic 6 by a macro */
> +                if (total_sgs > 6)
> +                        sgs = kmalloc_array(total_sgs, sizeof(sgs[0]), GFP_ATOMIC);
> +
> +                for (unsigned int i = 0; i < total_sgs; ++i)
> +                        sgs[i] = &req->sg[i];
> +
> +                virtqueue_unmap_sgs(vq, sgs, req->out_sgs, req->in_sgs);
> +
>                  /* blocking async request completes in a worker context */
>                  if (req->args->may_block) {
>                          struct virtio_fs_req_work *w;
> @@ -850,6 +863,9 @@ static void virtio_fs_requests_done_work(struct work_struct *work)
>                  } else {
>                          virtio_fs_request_complete(req, fsvq);
>                  }
> +
> +                if (sgs != stack_sgs)
> +                        kfree(sgs);
>          }
>
>          /* Try to push previously queued requests, as the queue might no longer be full */
> @@ -1426,6 +1442,11 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
>                  sgs[i] = &req->sg[i];
>          WARN_ON(req->out_sgs + req->in_sgs != total_sgs);
>
> +        // TODO can we change this ptr out of the lock?
> +        vq = fsvq->vq;
> +        // TODO handle this and following errors
> +        ret = virtqueue_map_sgs(vq, sgs, req->out_sgs, req->in_sgs);
> +        BUG_ON(ret < 0);
>          spin_lock(&fsvq->lock);
>
>          if (!fsvq->connected) {
> @@ -1434,8 +1455,8 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
>                  goto out;
>          }
>
> -        vq = fsvq->vq;
> -        ret = virtqueue_add_sgs(vq, sgs, req->out_sgs, req->in_sgs, req, GFP_ATOMIC);
> +        ret = virtqueue_add_sgs_premapped(vq, sgs, req->out_sgs,
> +                                          req->in_sgs, req, GFP_ATOMIC);
>          if (ret < 0) {
>                  spin_unlock(&fsvq->lock);
>                  goto out;
> --
> 2.48.1
>
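
Btw, for the request size sweep I mean something like the following
fio invocation (the mount point and job parameters here are just
placeholders, adjust them to your setup):

    # sequential read against the virtiofs mount, one job per block size
    fio --name=seqread --directory=/mnt/virtiofs \
        --ioengine=libaio --direct=1 --iodepth=16 \
        --rw=read --size=1G --bs=4k

and then repeat with --bs=64k and --bs=1M, so we can see whether the
premapped path helps more as the request size grows.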