On Thu, Jun 01, 2023 at 04:49:06PM +0200, Peter-Jan Gootzen wrote: > On 01/06/2023 16:08, Stefan Hajnoczi wrote: > > On Wed, May 31, 2023 at 04:49:39PM -0400, Vivek Goyal wrote: > >> On Wed, May 31, 2023 at 10:34:15PM +0200, Peter-Jan Gootzen wrote: > >>> On 31/05/2023 21:18, Vivek Goyal wrote: > >>>> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote: > >>>>> When the Virtio queue is full, a work item is scheduled > >>>>> to execute in 1ms that retries adding the request to the queue. > >>>>> This is a large amount of time on the scale on which a > >>>>> virtio-fs device can operate. When using a DPU this is around > >>>>> 40us baseline without going to a remote server (4k, QD=1). > >>>>> This patch queues requests when the Virtio queue is full, > >>>>> and when a completed request is taken off, immediately fills > >>>>> it back up with queued requests. > >>>>> > >>>>> This reduces the 99.9th percentile latencies in our tests by > >>>>> 60x and slightly increases the overall throughput, when using a > >>>>> queue depth 2x the size of the Virtio queue size, with a > >>>>> DPU-powered virtio-fs device. > >>>>> > >>>>> Signed-off-by: Peter-Jan Gootzen <peter-jan@xxxxxxxxxxx> > >>>>> --- > >>>>> V1 -> V2: Not scheduling dispatch work anymore when not needed > >>>>> and changed delayed_work structs to work_struct structs > >>>>> > >>>>> fs/fuse/virtio_fs.c | 32 +++++++++++++++++--------------- > >>>>> 1 file changed, 17 insertions(+), 15 deletions(-) > >>>>> > >>>>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c > >>>>> index 4d8d4f16c727..a676297db09b 100644 > >>>>> --- a/fs/fuse/virtio_fs.c > >>>>> +++ b/fs/fuse/virtio_fs.c > >>>>> @@ -45,7 +45,7 @@ struct virtio_fs_vq { > >>>>> struct work_struct done_work; > >>>>> struct list_head queued_reqs; > >>>>> struct list_head end_reqs; /* End these requests */ > >>>>> - struct delayed_work dispatch_work; > >>>>> + struct work_struct dispatch_work; > >>>>> struct fuse_dev *fud; > >>>>> bool connected; > >>>>> long in_flight; > >>>>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct virtio_fs_vq *fsvq) > >>>>> } > >>>>> > >>>>> flush_work(&fsvq->done_work); > >>>>> - flush_delayed_work(&fsvq->dispatch_work); > >>>>> + flush_work(&fsvq->dispatch_work); > >>>>> } > >>>>> > >>>>> static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs) > >>>>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work) > >>>>> dec_in_flight_req(fsvq); > >>>>> } > >>>>> } while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq))); > >>>>> + > >>>>> + if (!list_empty(&fsvq->queued_reqs)) > >>>>> + schedule_work(&fsvq->dispatch_work); > >>>>> spin_unlock(&fsvq->lock); > >>>>> } > >>>>> > >>>>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work) > >>>>> { > >>>>> struct fuse_req *req; > >>>>> struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq, > >>>>> - dispatch_work.work); > >>>>> + dispatch_work); > >>>>> int ret; > >>>>> > >>>>> pr_debug("virtio-fs: worker %s called.\n", __func__); > >>>>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work) > >>>>> if (ret == -ENOMEM || ret == -ENOSPC) { > >>>>> spin_lock(&fsvq->lock); > >>>>> list_add_tail(&req->list, &fsvq->queued_reqs); > >>>>> - schedule_delayed_work(&fsvq->dispatch_work, > >>>>> - msecs_to_jiffies(1)); > >>>> > >>>> Virtqueue being full is only one of the reasons for failure to queue > >>>> the request. What if virtqueue is empty but we could not queue the > >>>> request because lack of memory (-ENOMEM). In that case we will queue > >>>> the request and it might not be dispatched because there is no completion. > >>>> (Assume there is no further new request coming). That means deadlock? > >>>> > >>>> Thanks > >>>> Vivek > >>>> > >>> > >>> Good catch that will deadlock. > >>> > >>> Is default kernel behavior to indefinitely retry a file system > >>> request until memory is available? > >> > >> As of now that seems to be the behavior. I think I had copied this > >> code from another driver. > >> > >> But I guess one can argue that if memory is not available, then > >> return -ENOMEM to user space instead of retrying in kernel. > >> > >> Stefan, Miklos, WDYT? > > > > My understanding is that file system syscalls may return ENOMEM, so this > > is okay. > > > > Stefan > > Then I propose only handling -ENOSPC as a special case and letting all > other errors go through to userspace. > > Noob Linux contributor question: how often should I send in a new revision of > the patch? Should I wait for more comments or send in a V3 with that fix now? You can send v3. If it were a long patch series with multiple maintainers then maybe allowing the others additional time for review would avoid churn, but in this case I think it's fine to send the next revision. Stefan
Attachment:
signature.asc
Description: PGP signature
_______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization