Re: Large CQE for fuse headers

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Wed, 16 Oct 2024 13:53:00 +0200

On 10/16/24 12:54, Miklos Szeredi wrote:
> On Mon, 14 Oct 2024 at 23:27, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
> 
>> With only libfuse as ring user it is more like
>>
>> prep_requests(nr=N);
>> wait_cq(1); ==> we must not wait for more than 1 as more might never arrive
>> io_uring_for_each_cqe {
>> }
> 
> Right.
> 
> I think the point Pavel is trying to make is that io_uring queue
> sizes don't have to match fuse queue size.  So we could have
> sq_entries=4, cq_entries=4 and have the server queue 64
> FUSE_URING_REQ_FETCH commands, it just has to do that in batches of 4
> max.

Hmm ok, I guess that might matter when the payload is small compared to
the SQ/CQ size and the system is low on memory.
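
To put rough numbers on that, here is a small sketch. It assumes the ring
is created with both IORING_SETUP_SQE128 and IORING_SETUP_CQE32 (128-byte
SQEs, 32-byte CQEs) and counts only the queue entries themselves. The
helper names are made up for illustration, and the loop only models the
batching; a real server would prepare SQEs with io_uring_get_sqe() and
flush each batch with io_uring_submit().

```c
#include <assert.h>
#include <stddef.h>

/* Entry sizes with IORING_SETUP_SQE128 / IORING_SETUP_CQE32
 * (the default entry sizes are 64 and 16 bytes). */
#define BIG_SQE_SIZE 128u
#define BIG_CQE_SIZE 32u

/* Memory taken by the queue entries alone. Ring headers, the SQ index
 * array and page rounding are deliberately ignored. */
static size_t ring_entry_bytes(unsigned sq_entries, unsigned cq_entries)
{
	return (size_t)sq_entries * BIG_SQE_SIZE +
	       (size_t)cq_entries * BIG_CQE_SIZE;
}

/* Hypothetical helper: number of submit calls needed to queue nr_cmds
 * commands through an SQ that has only sq_entries slots. */
static unsigned submit_calls_needed(unsigned nr_cmds, unsigned sq_entries)
{
	unsigned queued = 0, calls = 0;

	assert(sq_entries > 0);
	while (queued < nr_cmds) {
		unsigned left = nr_cmds - queued;
		unsigned batch = left < sq_entries ? left : sq_entries;

		queued += batch;	/* stand-in for preparing 'batch' SQEs */
		calls++;		/* stand-in for one submit of that batch */
	}
	return calls;
}
```

With sq_entries=4 and cq_entries=4 the entries take 640 bytes, compared
to 10240 bytes for 64/64, and queueing 64 FUSE_URING_REQ_FETCH commands
takes 16 submit calls instead of one.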

> 
>> @Miklos maybe we avoid using large CQEs/SQEs and instead set up our own
>> separate buffer for FUSE headers?
> 
> The only gain from this would be in the case where the uring is used
> for non-fuse requests as well, in which case the extra space in the
> queue entries would be unused (i.e. 48 unused bytes in the cacheline).
> I don't know if this is a realistic use case or not.  It's definitely
> a challenge to create a library API that allows this.
> 
> The disadvantage would be a more complex interface.

I don't think it is that complicated. In the end it is just another pointer
that needs to be mapped. We don't even need to use mmap.
At least for zero-copy we will need to use the ring for non-fuse requests.
For the DDN use case, we are using another io_uring for TCP requests;
I would actually like to switch that to the same ring.
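
As a rough illustration of the "just another pointer" idea, something
like the following on the server side. This is only a sketch: the struct,
the function names and the 4096-byte alignment are assumptions for the
example, not the actual libfuse or kernel interface. The point is that
both buffers are plain userspace allocations, no mmap involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_ALIGN 4096u	/* assumed page size */

/* Hypothetical per-ring-entry buffers: FUSE headers live in their own
 * allocation, separate from the payload and from the SQE/CQE. */
struct ring_ent_bufs {
	void *headers;
	size_t headers_len;
	void *payload;
	size_t payload_len;
};

static size_t round_up(size_t len, size_t align)
{
	return (len + align - 1) / align * align;
}

/* Allocate both buffers; returns 0 on success, -1 on failure. */
static int ring_ent_bufs_alloc(struct ring_ent_bufs *b, size_t hdr_len,
			       size_t payload_len)
{
	assert(b != NULL);
	if (hdr_len == 0 || payload_len == 0)
		return -1;

	/* aligned_alloc() wants the size to be a multiple of the alignment */
	b->headers_len = round_up(hdr_len, BUF_ALIGN);
	b->payload_len = round_up(payload_len, BUF_ALIGN);
	b->headers = aligned_alloc(BUF_ALIGN, b->headers_len);
	b->payload = aligned_alloc(BUF_ALIGN, b->payload_len);
	if (!b->headers || !b->payload) {
		free(b->headers);
		free(b->payload);
		b->headers = b->payload = NULL;
		return -1;
	}
	memset(b->headers, 0, b->headers_len);
	return 0;
}

static void ring_ent_bufs_free(struct ring_ent_bufs *b)
{
	free(b->headers);
	free(b->payload);
	b->headers = b->payload = NULL;
}
```

The two pointers would then be handed to the kernel once per ring entry,
and the SQEs/CQEs could stay at their default size.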

Thanks,
Bernd