Re: [PATCHSET 0/4] Allow allocated direct descriptors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/9/22 11:35 PM, Hao Xu wrote:
> ? 2022/5/9 ??10:49, Jens Axboe ??:
>> On 5/9/22 7:20 AM, Hao Xu wrote:
>>> ? 2022/5/9 ??7:49, Jens Axboe ??:
>>>> Hi,
>>>>
>>>> Currently using direct descriptors with open or accept requires the
>>>> application to manage the descriptor space, picking which slot to use
>>>> for any given file. However, there are cases where it's useful to just
>>>> get a direct descriptor and not care about which value it is, instead
>>>> just return it like a normal open or accept would.
>>>>
>>>> This will also be useful for multishot accept support, where allocated
>>>> direct descriptors are a requirement to make that feature work with
>>>> these kinds of files.
>>>>
>>>> This adds support for allocating a new fixed descriptor. This is chosen
>>>> by passing in UINT_MAX as the fixed slot, which otherwise has a limit
>>>> of INT_MAX like any file descriptor does.
>>>>
>>>>    fs/io_uring.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++---
>>>>     1 file changed, 94 insertions(+), 6 deletions(-)
>>>>
>>> Hi Jens,
>>> I've read this idea of leveraging bitmap, it looks great. a small flaw
>>> of it is that when the file_table is very long, the bitmap searching
>>> seems to be O({length of table}/BITS_PER_LONG), to make the time
>>> complexity stable, I did a linked list version, could you have a look
>>> when you're avalible. totally untested, just to show my idea. Basically
>>> I use a list to link all the free slots, when we need a slot, just get
>>> the head of it.
>>> https://github.com/HowHsu/linux/commits/for-5.19/io_uring_multishot_accept_v5
>>>
>>> (borrowed some commit message from your patches)
>>
>> While that's certainly true, I'm skeptical that the list management will
>> be faster for most cases. It's worth nothing that the regular file
>> allocator is very much the same thing. A full scan is unlikely unless
>> you already got -ENFILE. Any clear in between will reset the hint and
>> it'll be O(1) again. So yes, the pathological case of having no
> 
> it's not O(1) actually, and a full bitmap is not the only worst case.
> For instance, the bitmap is like:
>                              hint
>                               |
>    1111111111111111111111111110000
> 
> then a bit is cleared and hint is updated:
>      hint
>       |
>    1110111111111111111111111110000
> 
> then next time the complexity is high

Next time it's fine, since the hint is that bit. If you do do, then yes
the second would be a slower.

> So in this kind of scenario(first allocate many in order, then clear
> low bit and allocation goes on in turn), it would be slow. And I think
> these cases are not rare since people usually allocate many fds then
> free the early used fds from time to time.

It's by no means perfect, but if it's good enough for the normal file
allocator, then I don't think it'd be wise to over-engineer this one
until there's a proven need to do so.

The single list items tracking free items is most certainly a LOT slower
for the common cases, so I don't think that's a good approach at all.

My suggestion would be to stick with the proposed approach until there's
evidence that the allocator needs improving. I did write a benchmark
that uses a 500K map and does opens and closes, and I don't see anything
to worry about in terms of overhead. The bitmap handling doesn't even
really register, dwarfed by the rest of the open path.

>> If the case of finding a new descriptor is slow for a mostly full space,
>> in the past I've done something like axmap [1] in fio, where you each
>> 64-bit entry is representing by a single bit a layer up. That still has
>> very good space utilization and good cache layout, which the list very
>> much does not. But given the above, I don't think we need to worry about
>> that really.
>>
>> As a side note, I do think we need to just bump the size of the max
>> direct descriptors we can have. With the file table potentially being
>> vmalloc backed, there's no reason to limit it to the current 32K.
> 
> Agree.
> 
>>
>> [1] https://git.kernel.dk/cgit/fio/tree/lib/axmap.c
>>
> Cool, I'll have a look.

It can get boiled down to something a bit simpler as the fio
implementation supports a variety of different use cases. For example, I
think it should be implemented as a single indexed array that holds all
the levels, rather than separate is it's done there. In short, it just
condenses everything down to one qword eventually, and finding a free
bit is always O(log64(N)).

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux