Re: [PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 31, 2023 at 04:49:39PM -0400, Vivek Goyal wrote:
> On Wed, May 31, 2023 at 10:34:15PM +0200, Peter-Jan Gootzen wrote:
> > On 31/05/2023 21:18, Vivek Goyal wrote:
> > > On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
> > >> When the Virtio queue is full, a work item is scheduled
> > >> to execute in 1ms that retries adding the request to the queue.
> > >> This is a large amount of time on the scale on which a
> > >> virtio-fs device can operate. When using a DPU this is around
> > >> 40us baseline without going to a remote server (4k, QD=1).
> > >> This patch queues requests when the Virtio queue is full,
> > >> and when a completed request is taken off, immediately fills
> > >> it back up with queued requests.
> > >>
> > >> This reduces the 99.9th percentile latencies in our tests by
> > >> 60x and slightly increases the overall throughput, when using a
> > >> queue depth 2x the size of the Virtio queue size, with a
> > >> DPU-powered virtio-fs device.
> > >>
> > >> Signed-off-by: Peter-Jan Gootzen <peter-jan@xxxxxxxxxxx>
> > >> ---
> > >> V1 -> V2: Not scheduling dispatch work anymore when not needed
> > >> and changed delayed_work structs to work_struct structs
> > >>
> > >>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
> > >>  1 file changed, 17 insertions(+), 15 deletions(-)
> > >>
> > >> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> > >> index 4d8d4f16c727..a676297db09b 100644
> > >> --- a/fs/fuse/virtio_fs.c
> > >> +++ b/fs/fuse/virtio_fs.c
> > >> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
> > >>  	struct work_struct done_work;
> > >>  	struct list_head queued_reqs;
> > >>  	struct list_head end_reqs;	/* End these requests */
> > >> -	struct delayed_work dispatch_work;
> > >> +	struct work_struct dispatch_work;
> > >>  	struct fuse_dev *fud;
> > >>  	bool connected;
> > >>  	long in_flight;
> > >> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct virtio_fs_vq *fsvq)
> > >>  	}
> > >>  
> > >>  	flush_work(&fsvq->done_work);
> > >> -	flush_delayed_work(&fsvq->dispatch_work);
> > >> +	flush_work(&fsvq->dispatch_work);
> > >>  }
> > >>  
> > >>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
> > >> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work)
> > >>  			dec_in_flight_req(fsvq);
> > >>  		}
> > >>  	} while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq)));
> > >> +
> > >> +	if (!list_empty(&fsvq->queued_reqs))
> > >> +		schedule_work(&fsvq->dispatch_work);
> > >>  	spin_unlock(&fsvq->lock);
> > >>  }
> > >>  
> > >> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
> > >>  {
> > >>  	struct fuse_req *req;
> > >>  	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
> > >> -						 dispatch_work.work);
> > >> +						 dispatch_work);
> > >>  	int ret;
> > >>  
> > >>  	pr_debug("virtio-fs: worker %s called.\n", __func__);
> > >> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
> > >>  			if (ret == -ENOMEM || ret == -ENOSPC) {
> > >>  				spin_lock(&fsvq->lock);
> > >>  				list_add_tail(&req->list, &fsvq->queued_reqs);
> > >> -				schedule_delayed_work(&fsvq->dispatch_work,
> > >> -						      msecs_to_jiffies(1));
> > > 
> > > Virtqueue being full is only one of the reasons for failure to queue
> > > the request. What if virtqueue is empty but we could not queue the
> > > request because lack of memory (-ENOMEM). In that case we will queue
> > > the request and it might not be dispatched because there is no completion.
> > > (Assume there is no further new request coming). That means deadlock?
> > > 
> > > Thanks
> > > Vivek
> > > 
> > 
> > Good catch that will deadlock.
> > 
> > Is default kernel behavior to indefinitely retry a file system
> > request until memory is available?
> 
> As of now that seems to be the behavior. I think I had copied this
> code from another driver. 
> 
> But I guess one can argue that if memory is not available, then
> return -ENOMEM to user space instead of retrying in kernel.
> 
> Stefan, Miklos, WDYT?

My understanding is that file system syscalls may return ENOMEM, so this
is okay.

Stefan

Attachment: signature.asc
Description: PGP signature

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux