On Thu, Jan 23, 2025 at 10:40 AM Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote: > > > > On 1/23/25 19:32, Joanne Koong wrote: > > On Thu, Jan 23, 2025 at 10:06 AM Bernd Schubert > > <bernd.schubert@xxxxxxxxxxx> wrote: > >> > >> > >> > >> On 1/23/25 18:48, Joanne Koong wrote: > >>> On Thu, Jan 23, 2025 at 9:19 AM Bernd Schubert > >>> <bernd.schubert@xxxxxxxxxxx> wrote: > >>>> > >>>> Hi Joanne, > >>>> > >>>>>>> Thanks, applied and pushed with some cleanups including Luis's clamp idea. > >>>>>> > >>>>>> Hi Miklos, > >>>>>> > >>>>>> I don't think the timeouts do work with io-uring yet, I'm not sure > >>>>>> yet if I have time to work on that today or tomorrow (on something > >>>>>> else right now, I can try, but no promises). > >>>>> > >>>>> Hi Bernd, > >>>>> > >>>>> What are your thoughts on what is missing on the io-uring side for > >>>>> timeouts? If a request times out, it will abort the connection and > >>>>> AFAICT, the abort logic should already be fine for io-uring, as users > >>>>> can currently abort the connection through the sysfs interface and > >>>>> there's no internal difference in aborting through sysfs vs timeouts. > >>>>> > >>>> > >>>> in fuse_check_timeout() it iterates over each fud and then fpq. > >>>> In dev_uring.c fpq is is per queue but unrelated to fud. In current > >>>> fuse-io-uring fud is not cloned anymore - using fud won't work. > >>>> And Requests are also not queued at all on the other list > >>>> fuse_check_timeout() is currently checking. > >>> > >>> In the io-uring case, there still can be fuds and their associated > >>> fpqs given that /dev/fuse can be used still, no? So wouldn't the > >>> io-uring case still need this logic in fuse_check_timeout() for > >>> checking requests going through /dev/fuse? > >> > >> Yes, these need to be additionally checked. > >> > >>> > >>>> > >>>> Also, with a ring per core, maybe better to use > >>>> a per queue check that is core bound? I.e. zero locking overhead? > >>> > >>> How do you envision a queue check that bypasses grabbing the > >>> queue->lock? The timeout handler could be triggered on any core, so > >>> I'm not seeing how it could be core bound. > >> > >> I don't want to bypass it, but maybe each queue could have its own > >> workq and timeout checker? And then use queue_delayed_work_on()? > >> > >> > >>> > >>>> And I think we can also avoid iterating over hash lists (queue->fpq), > >>>> but can use the 'ent_in_userspace' list. > >>>> > >>>> We need to iterate over these other entry queues anyway: > >>>> > >>>> ent_w_req_queue > >>>> fuse_req_bg_queue > >>>> ent_commit_queue > >>>> > >>> > >>> Why do we need to iterate through the ent lists (ent_w_req_queue and > >>> ent_commit_queue)? AFAICT, in io-uring a request is either on the > >>> fuse_req_queue/fuse_req_bg_queue or on the fpq->processing list. Even > >>> when an entry has been queued to ent_w_req_queue or ent_commit_queue, > >>> the request itself is still queued on > >>> fuse_req_queue/fuse_req_bg_queue/fpq->processing. I'm not sure I > >>> understand why we still need to look at the ent lists? > >> > >> Yeah you are right, we could avoid ent_w_req_queue and ent_commit_queue > >> if we use fpq->processing, but processing consists of 256 lists - > >> overhead is much smaller by using the entry lists? > >> > >> > >>> > >>>> > >>>> And we also need to iterate over > >>>> > >>>> fuse_req_queue > >>>> fuse_req_bg_queue > >>> > >>> Why do we need to iterate through fuse_req_queue and > >>> fuse_req_bg_queue? fuse_uring_request_expired() checks the head of > >>> fuse_req_queue and fuse_req_bg_queue and given that requests are added > >>> to fuse_req_queue/fuse_req_bg_queue sequentially (eg added to the tail > >>> of these lists), why isn't this enough? > >> > >> I admit I'm a bit lost with that question. Aren't you pointing out > >> the same lists as I do? > >> > > > > Oh, I thought your comment was saying that we need to "iterate" over > > it (eg go through every request on the lists)? It currently already > > checks the fuse_req_queue and fuse_req_bg_queue lists (in > > fuse_uring_request_expired() which gets invoked in the > > fuse_check_timeout() timeout handler). > > > > Maybe the fuse_uring_request_expired() addition got missed - I tried > > to call this out in the cover letter changelog, as I had to rebase > > this patchset on top of the io-uring patches, so I added this function > > in to make it work for io-uring. I believe this suffices for now for > > the io uring case (with future optimizations that can be added)? > > > Ah sorry, that is me, I had missed you had already rebased it to > io-uring. > > So we are good to land this version. > Just would be good if we could optimize this soon - our test systems > have 96 cores - 24576 list heads to check... I won't manage to work > on it today and probably also not tomorrow, but by Monday I should > have an optimized version ready. > Oh wow, 96 cores!! I'll submit an optimized version of this today or tomorrow then, if that makes things easier for you. Thanks, Joanne > Thanks, > Bernd >