On 10/3/24 15:19, Bernd Schubert wrote: > > > On 10/3/24 14:02, Miklos Szeredi wrote: >> On Thu, 3 Oct 2024 at 12:10, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote: >> >>> What I mean is that you wanted to get rid of the 'tag' - using any kind of >>> search means we still need it. I.e. we cannot just take last list head >>> or tail and use that. >>> The array is only dynamic at initialization time. And why spending O(logN) >>> to search instead of O(1)? >> >> Because for sane queue depths they are essentially the same. This is >> not where we can gain or lose any significant performance. >> >>> And I know that it is an implementation detail, I just would like to avoid >>> many rebasing rounds on these details. >> >> I think the logical interface would be: >> >> - pass a userspace buffer to FETCH (you told me, but I don't remember >> why sqe->addr isn't suitable) > > I think we could change to that now. > >> >> - set sqe->user_data to an implementation dependent value, this could >> be just the userspace buffer, but it could be a request object > > Libfuse already does this, it points to 'struct fuse_ring_ent', which then > points to the buffer. Maybe that could be optimized to have > 'struct fuse_ring_ent' as part of the buffer. > >> >> - kernel allocates an idle request and queues it. >> >> - request comes in, kernel takes a request from the idle queue and fills it >> >> - cqe->user_data is returned with the original sqe->user_data, which >> should be sufficient for the server to identify the request >> >> - process request, send COMMIT_AND_FETCH with the userspace buffer >> and user data >> >> - the kernel reads the header from the userspace buffer, finds >> outh->unique, finds and completes the request >> >> - then queues the request on the idle queue >> >> ... >> >> What's wrong with that? > > In my opinion a bit sad that we have to search > instead of just doing an array[ID] access. I certainly don't want to > rely on the current hashed list search, this only works reasonably > well, because with the current threading model requests in fly is > typically small. > And then rb entries also have pointers - it won't take > any less memory, more on the contrary. > > At best one could argue that on tear down races are avoided as one > has to do a search now. Although that was handled but checks > tear-down checks in fuse_uring_cmd. > > Well, if you really prefer, I'm going to add use an rbtree or maybe > better xarray search. No reason for anything like that in libfuse, it can > just continue to type cast 'user_data'. I'm inclined to do xarray, but still to assign an index (in the order of received FUSE_URING_REQ_FETCH per queue) - should still give O(1). Thanks, Bernd