Re: [PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

Peter-Jan Gootzen via Virtualization <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx> · Wed, 31 May 2023 22:34:15 +0200

On 31/05/2023 21:18, Vivek Goyal wrote:
> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
>> When the Virtio queue is full, a work item is scheduled
>> to execute in 1ms that retries adding the request to the queue.
>> This is a large amount of time on the scale on which a
>> virtio-fs device can operate. When using a DPU this is around
>> 40us baseline without going to a remote server (4k, QD=1).
>> This patch queues requests when the Virtio queue is full,
>> and when a completed request is taken off, immediately fills
>> it back up with queued requests.
>>
>> This reduces the 99.9th percentile latencies in our tests by
>> 60x and slightly increases the overall throughput, when using a
>> queue depth 2x the size of the Virtio queue size, with a
>> DPU-powered virtio-fs device.
>>
>> Signed-off-by: Peter-Jan Gootzen <peter-jan@xxxxxxxxxxx>
>> ---
>> V1 -> V2: Not scheduling dispatch work anymore when not needed
>> and changed delayed_work structs to work_struct structs
>>
>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
>>  1 file changed, 17 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>> index 4d8d4f16c727..a676297db09b 100644
>> --- a/fs/fuse/virtio_fs.c
>> +++ b/fs/fuse/virtio_fs.c
>> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
>>  	struct work_struct done_work;
>>  	struct list_head queued_reqs;
>>  	struct list_head end_reqs;	/* End these requests */
>> -	struct delayed_work dispatch_work;
>> +	struct work_struct dispatch_work;
>>  	struct fuse_dev *fud;
>>  	bool connected;
>>  	long in_flight;
>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct virtio_fs_vq *fsvq)
>>  	}
>>  
>>  	flush_work(&fsvq->done_work);
>> -	flush_delayed_work(&fsvq->dispatch_work);
>> +	flush_work(&fsvq->dispatch_work);
>>  }
>>  
>>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work)
>>  			dec_in_flight_req(fsvq);
>>  		}
>>  	} while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq)));
>> +
>> +	if (!list_empty(&fsvq->queued_reqs))
>> +		schedule_work(&fsvq->dispatch_work);
>>  	spin_unlock(&fsvq->lock);
>>  }
>>  
>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
>>  {
>>  	struct fuse_req *req;
>>  	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
>> -						 dispatch_work.work);
>> +						 dispatch_work);
>>  	int ret;
>>  
>>  	pr_debug("virtio-fs: worker %s called.\n", __func__);
>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
>>  			if (ret == -ENOMEM || ret == -ENOSPC) {
>>  				spin_lock(&fsvq->lock);
>>  				list_add_tail(&req->list, &fsvq->queued_reqs);
>> -				schedule_delayed_work(&fsvq->dispatch_work,
>> -						      msecs_to_jiffies(1));
> 
> Virtqueue being full is only one of the reasons for failure to queue
> the request. What if virtqueue is empty but we could not queue the
> request because lack of memory (-ENOMEM). In that case we will queue
> the request and it might not be dispatched because there is no completion.
> (Assume there is no further new request coming). That means deadlock?
> 
> Thanks
> Vivek
> 

Good catch that will deadlock.

Is default kernel behavior to indefinitely retry a file system
request until memory is available?

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization