Jens Axboe <axboe@xxxxxxxxx> writes: > On 2/23/23 12:02?PM, Gabriel Krisman Bertazi wrote: >> Breno Leitao <leitao@xxxxxxxxxx> writes: >> >>> Having cache entries linked using the hlist format brings no benefit, and >>> also requires an unnecessary extra pointer address per cache entry. >>> >>> Use the internal io_wq_work_node single-linked list for the internal >>> alloc caches (async_msghdr and async_poll) >>> >>> This is required to be able to use KASAN on cache entries, since we do >>> not need to touch unused (and poisoned) cache entries when adding more >>> entries to the list. >>> >> >> Looking at this patch, I wonder if it could go in the opposite direction >> instead, and drop io_wq_work_node entirely in favor of list_head. :) >> >> Do we gain anything other than avoiding the backpointer with a custom >> linked implementation, instead of using the interface available in >> list.h, that developers know how to use and has other features like >> poisoning and extra debug checks? > > list_head is twice as big, that's the main motivation. This impacts > memory usage (obviously), but also caches when adding/removing > entries. Right. But this is true all around the kernel. Many (Most?) places that use list_head don't even need to touch list_head->prev. And list_head is usually embedded in larger structures where the cost of the extra pointer is insignificant. I suspect the memory footprint shouldn't really be the problem. This specific patch is extending io_wq_work_node to io_cache_entry, where the increased size will not matter. In fact, for the cached structures, the cache layout and memory footprint don't even seem to change, as io_cache_entry is already in a union larger than itself, that is not crossing cachelines, (io_async_msghdr, async_poll). The other structures currently embedding struct io_work_node are io_kiocb (216 bytes long, per request) and io_ring_ctx (1472 bytes long, per ring). so it is not like we are saving a lot of memory with a single linked list. A more compact cache line still makes sense, though, but I think the only case (if any) where there might be any gain is io_kiocb? I don't severely oppose this patch, of course. But I think it'd be worth killing io_uring/slist.h entirely in the future instead of adding more users. I intend to give that approach a try, if there's a way to keep the size of io_kiocb. -- Gabriel Krisman Bertazi