Re: [PATCH v3 1/2] io_uring: Move from hlist to io_wq_work_node

Gabriel Krisman Bertazi <krisman@xxxxxxx> · Fri, 24 Feb 2023 15:32:47 -0300

Jens Axboe <axboe@xxxxxxxxx> writes:

> On 2/23/23 12:02?PM, Gabriel Krisman Bertazi wrote:
>> Breno Leitao <leitao@xxxxxxxxxx> writes:
>> 
>>> Having cache entries linked using the hlist format brings no benefit, and
>>> also requires an unnecessary extra pointer address per cache entry.
>>>
>>> Use the internal io_wq_work_node single-linked list for the internal
>>> alloc caches (async_msghdr and async_poll)
>>>
>>> This is required to be able to use KASAN on cache entries, since we do
>>> not need to touch unused (and poisoned) cache entries when adding more
>>> entries to the list.
>>>
>> 
>> Looking at this patch, I wonder if it could go in the opposite direction
>> instead, and drop io_wq_work_node entirely in favor of list_head. :)
>> 
>> Do we gain anything other than avoiding the backpointer with a custom
>> linked implementation, instead of using the interface available in
>> list.h, that developers know how to use and has other features like
>> poisoning and extra debug checks?
>
> list_head is twice as big, that's the main motivation. This impacts
> memory usage (obviously), but also caches when adding/removing
> entries.

Right. But this is true all around the kernel.  Many (Most?)  places
that use list_head don't even need to touch list_head->prev.  And
list_head is usually embedded in larger structures where the cost of
the extra pointer is insignificant.  I suspect the memory
footprint shouldn't really be the problem.

This specific patch is extending io_wq_work_node to io_cache_entry,
where the increased size will not matter.  In fact, for the cached
structures, the cache layout and memory footprint don't even seem to
change, as io_cache_entry is already in a union larger than itself, that
is not crossing cachelines, (io_async_msghdr, async_poll).

The other structures currently embedding struct io_work_node are
io_kiocb (216 bytes long, per request) and io_ring_ctx (1472 bytes long,
per ring). so it is not like we are saving a lot of memory with a single
linked list. A more compact cache line still makes sense, though, but I
think the only case (if any) where there might be any gain is io_kiocb?

I don't severely oppose this patch, of course. But I think it'd be worth
killing io_uring/slist.h entirely in the future instead of adding more
users.  I intend to give that approach a try, if there's a way to keep
the size of io_kiocb.

-- 
Gabriel Krisman Bertazi