Re: Large CQE for fuse headers

On 10/11/24 12:35 PM, Bernd Schubert wrote:
> On 10/11/24 19:57, Jens Axboe wrote:
>> On 10/10/24 2:56 PM, Bernd Schubert wrote:
>>> Hello,
>>>
>>> as discussed during LPC, we would like to have large CQE sizes, at least
>>> 256B. Ideally 256B for fuse, but CQE512 might be a bit too much...
>>>
>>> Pavel said that this should be ok, but it would be better to have the CQE
>>> size as a function argument.
>>> Could you give me some hints on how this should look, and especially how
>>> we are going to communicate the CQE size to the kernel? I guess just adding
>>> IORING_SETUP_CQE256 / IORING_SETUP_CQE512 would be much easier.
>>
>> Not Pavel, and unfortunately I could not be at that LPC discussion, but
>> yeah, I don't see why just adding the necessary SETUP arg for this
>> wouldn't be the way to go. As long as they are a power-of-2, all it'll
>> impact on both the kernel and liburing side is what size shift to use
>> when iterating CQEs.
> 
> Thanks. Pavel also wanted power-of-2, although 512 is a bit much for fuse.
> Well, maybe 256 will be sufficient. Going to look into adding that parameter
> over the next few days.

We really have to keep it power-of-2, just to avoid convoluting the logic
(and adding overhead) when iterating the CQ ring and CQEs. You can search
for IORING_SETUP_CQE32 in the kernel to see how it's just a shift, and
ditto on the liburing side.
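
Roughly, the shift looks like this - a sketch modeled on the liburing
pattern, not the exact code, and the IORING_SETUP_CQE256 case mentioned
below is the proposed flag, not something that exists today:

#include <linux/io_uring.h>	/* IORING_SETUP_CQE32, struct io_uring_cqe */

static inline unsigned cqe_shift(unsigned ring_flags)
{
	if (ring_flags & IORING_SETUP_CQE32)
		return 1;	/* 32B CQEs: two 16B slots each */
	/* A hypothetical IORING_SETUP_CQE256 would return 4 here,
	 * since 256 / 16 == 1 << 4. */
	return 0;		/* default 16B CQEs */
}

static inline struct io_uring_cqe *
cqe_at(struct io_uring_cqe *cqes, unsigned head, unsigned mask,
       unsigned ring_flags)
{
	/* Index math stays multiply-free: mask the head, then shift
	 * by the CQE size factor. */
	return &cqes[(head & mask) << cqe_shift(ring_flags)];
}

On the setup side this would just extend the existing liburing usage,
e.g. io_uring_queue_init(entries, &ring, IORING_SETUP_CQE32) today, with
a new flag passed the same way.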

Curious, what's all the space needed for?

>> Since this obviously means larger CQ rings, one nice side effect is that
>> since 6.10 we don't need contig pages to map any of the rings. So should
>> work just fine regardless of memory fragmentation, where previously that
>> would've been a concern.
>>
> 
> Out of interest, what is the change? Up to fuse-io-uring RFC2 I was using
> vmalloc'ed buffers for fuse that got mmapped - that was working fine. Miklos
> just wants to avoid the kernel allocating large chunks of memory on behalf
> of users.

It was the change that got rid of remap_pfn_range() for mapping, and
switched to vm_insert_page() instead. Memory overhead should generally
not be too bad; it's all about sizing the rings appropriately. The much
bigger concern is needing contiguous memory, as that can become scarce
after longer uptimes, even with plenty of memory free. This is
particularly important if you need 512B CQEs, obviously.
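
The pattern is roughly this - a simplified sketch of the
vm_insert_page() approach, not the actual io_uring mmap code, with
ring_mmap() and ring_buf as made-up names:

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

/* ring_buf: ring memory allocated with vmalloc(), so it is only
 * virtually contiguous - no physically contiguous range required. */
static void *ring_buf;

static int ring_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	unsigned long off;
	int ret;

	for (off = 0; off < size; off += PAGE_SIZE) {
		/* Resolve each backing page individually and insert
		 * it into the user mapping. */
		struct page *page = vmalloc_to_page(ring_buf + off);

		ret = vm_insert_page(vma, vma->vm_start + off, page);
		if (ret)
			return ret;
	}
	return 0;
}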

-- 
Jens Axboe



