Re: [PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

Stefan Hajnoczi <stefanha@xxxxxxxxxx> · Thu, 1 Jun 2023 14:26:04 -0400

On Thu, Jun 01, 2023 at 04:49:06PM +0200, Peter-Jan Gootzen wrote:
> On 01/06/2023 16:08, Stefan Hajnoczi wrote:
> > On Wed, May 31, 2023 at 04:49:39PM -0400, Vivek Goyal wrote:
> >> On Wed, May 31, 2023 at 10:34:15PM +0200, Peter-Jan Gootzen wrote:
> >>> On 31/05/2023 21:18, Vivek Goyal wrote:
> >>>> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
> >>>>> When the Virtio queue is full, a work item is scheduled
> >>>>> to execute in 1ms that retries adding the request to the queue.
> >>>>> This is a large amount of time on the scale on which a
> >>>>> virtio-fs device can operate. When using a DPU this is around
> >>>>> 40us baseline without going to a remote server (4k, QD=1).
> >>>>> This patch queues requests when the Virtio queue is full,
> >>>>> and when a completed request is taken off, immediately fills
> >>>>> it back up with queued requests.
> >>>>>
> >>>>> This reduces the 99.9th percentile latencies in our tests by
> >>>>> 60x and slightly increases the overall throughput, when using a
> >>>>> queue depth 2x the size of the Virtio queue size, with a
> >>>>> DPU-powered virtio-fs device.
> >>>>>
> >>>>> Signed-off-by: Peter-Jan Gootzen <peter-jan@xxxxxxxxxxx>
> >>>>> ---
> >>>>> V1 -> V2: Not scheduling dispatch work anymore when not needed
> >>>>> and changed delayed_work structs to work_struct structs
> >>>>>
> >>>>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
> >>>>>  1 file changed, 17 insertions(+), 15 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> >>>>> index 4d8d4f16c727..a676297db09b 100644
> >>>>> --- a/fs/fuse/virtio_fs.c
> >>>>> +++ b/fs/fuse/virtio_fs.c
> >>>>> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
> >>>>>  	struct work_struct done_work;
> >>>>>  	struct list_head queued_reqs;
> >>>>>  	struct list_head end_reqs;	/* End these requests */
> >>>>> -	struct delayed_work dispatch_work;
> >>>>> +	struct work_struct dispatch_work;
> >>>>>  	struct fuse_dev *fud;
> >>>>>  	bool connected;
> >>>>>  	long in_flight;
> >>>>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct virtio_fs_vq *fsvq)
> >>>>>  	}
> >>>>>  
> >>>>>  	flush_work(&fsvq->done_work);
> >>>>> -	flush_delayed_work(&fsvq->dispatch_work);
> >>>>> +	flush_work(&fsvq->dispatch_work);
> >>>>>  }
> >>>>>  
> >>>>>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
> >>>>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work)
> >>>>>  			dec_in_flight_req(fsvq);
> >>>>>  		}
> >>>>>  	} while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq)));
> >>>>> +
> >>>>> +	if (!list_empty(&fsvq->queued_reqs))
> >>>>> +		schedule_work(&fsvq->dispatch_work);
> >>>>>  	spin_unlock(&fsvq->lock);
> >>>>>  }
> >>>>>  
> >>>>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
> >>>>>  {
> >>>>>  	struct fuse_req *req;
> >>>>>  	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
> >>>>> -						 dispatch_work.work);
> >>>>> +						 dispatch_work);
> >>>>>  	int ret;
> >>>>>  
> >>>>>  	pr_debug("virtio-fs: worker %s called.\n", __func__);
> >>>>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
> >>>>>  			if (ret == -ENOMEM || ret == -ENOSPC) {
> >>>>>  				spin_lock(&fsvq->lock);
> >>>>>  				list_add_tail(&req->list, &fsvq->queued_reqs);
> >>>>> -				schedule_delayed_work(&fsvq->dispatch_work,
> >>>>> -						      msecs_to_jiffies(1));
> >>>>
> >>>> Virtqueue being full is only one of the reasons for failure to queue
> >>>> the request. What if virtqueue is empty but we could not queue the
> >>>> request because lack of memory (-ENOMEM). In that case we will queue
> >>>> the request and it might not be dispatched because there is no completion.
> >>>> (Assume there is no further new request coming). That means deadlock?
> >>>>
> >>>> Thanks
> >>>> Vivek
> >>>>
> >>>
> >>> Good catch that will deadlock.
> >>>
> >>> Is default kernel behavior to indefinitely retry a file system
> >>> request until memory is available?
> >>
> >> As of now that seems to be the behavior. I think I had copied this
> >> code from another driver. 
> >>
> >> But I guess one can argue that if memory is not available, then
> >> return -ENOMEM to user space instead of retrying in kernel.
> >>
> >> Stefan, Miklos, WDYT?
> > 
> > My understanding is that file system syscalls may return ENOMEM, so this
> > is okay.
> > 
> > Stefan
> 
> Then I propose only handling -ENOSPC as a special case and letting all
> other errors go through to userspace.
> 
> Noob Linux contributor question: how often should I send in a new revision of
> the patch? Should I wait for more comments or send in a V3 with that fix now?

You can send v3. If it were a long patch series with multiple
maintainers then maybe allowing the others additional time for review
would avoid churn, but in this case I think it's fine to send the next
revision.

Stefan
Attachment:
signature.asc

Description: PGP signature
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization