Re: fuse-io-uring: We need to keep the tag/index

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Thu, 3 Oct 2024 15:56:34 +0200

On 10/3/24 15:19, Bernd Schubert wrote:
> 
> 
> On 10/3/24 14:02, Miklos Szeredi wrote:
>> On Thu, 3 Oct 2024 at 12:10, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
>>
>>> What I mean is that you wanted to get rid of the 'tag' - using any kind of
>>> search means we still need it. I.e. we cannot just take last list head
>>> or tail and use that.
>>> The array is only dynamic at initialization time. And why spending O(logN)
>>> to search instead of O(1)?
>>
>> Because for sane queue depths they are essentially the same.  This is
>> not where we can gain or lose any significant performance.
>>
>>> And I know that it is an implementation detail, I just would like to avoid
>>> many rebasing rounds on these details.
>>
>> I think the logical interface would be:
>>
>>  - pass a userspace buffer to FETCH (you told me, but I don't remember
>> why sqe->addr isn't suitable)
> 
> I think we could change to that now. 
> 
>>
>>  - set sqe->user_data to an implementation dependent value, this could
>> be just the userspace buffer, but it could be a request object
> 
> Libfuse already does this, it points to 'struct fuse_ring_ent', which then
> points to the buffer. Maybe that could be optimized to have 
> 'struct fuse_ring_ent' as part of the buffer.
> 
>>
>>  - kernel allocates an idle request and queues it.
>>
>>  - request comes in, kernel takes a request from the idle queue and fills it
>>
>>  - cqe->user_data is returned with the original sqe->user_data, which
>> should be sufficient for the server to identify the request
>>
>>  - process request, send COMMIT_AND_FETCH with the userspace buffer
>> and user data
>>
>>  - the kernel reads the header from the userspace buffer, finds
>> outh->unique, finds and completes the request
>>
>>  - then queues the request on the idle queue
>>
>> ...
>>
>> What's wrong with that?
> 
> In my opinion a bit sad that we have to search
> instead of just doing an array[ID] access. I certainly don't want to
> rely on the current hashed list search, this only works reasonably
> well, because with the current threading model requests in fly is
> typically small. 
> And then rb entries also have pointers - it won't take 
> any less memory, more on the contrary. 
> 
> At best one could argue that on tear down races are avoided as one
> has to do a search now. Although that was handled but checks
> tear-down checks in fuse_uring_cmd.
> 
> Well, if you really prefer, I'm going to add use an rbtree or maybe
> better xarray search.  No reason for anything like that in libfuse, it can
> just continue to type cast 'user_data'.

I'm inclined to do xarray, but still to assign an index (in the order of
received FUSE_URING_REQ_FETCH per queue) - should still give O(1).

Thanks,
Bernd