On 11/19/24 8:30 AM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@xxxxxxxxx> writes:
>
>> On 11/18/24 6:22 PM, Gabriel Krisman Bertazi wrote:
>>> diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h
>>> index b7a38a2069cf..6b34e491a30a 100644
>>> --- a/io_uring/alloc_cache.h
>>> +++ b/io_uring/alloc_cache.h
>>> @@ -30,6 +30,13 @@ static inline void *io_alloc_cache_get(struct io_alloc_cache *cache)
>>> 	return NULL;
>>> }
>>>
>>> +static inline void *io_alloc_cache_alloc(struct io_alloc_cache *cache, gfp_t gfp)
>>> +{
>>> +	if (!cache->nr_cached)
>>> +		return kzalloc(cache->elem_size, gfp);
>>> +	return io_alloc_cache_get(cache);
>>> +}
>>
>> I don't think you want to use kzalloc here. The caller will need to
>> clear what its needs for the cached path anyway, so has no other option
>> than to clear/set things twice for that case.
>
> Hi Jens,
>
> The reason I do kzalloc here is to be able to trust the value of
> rw->free_iov (io_rw_alloc_async) and hdr->free_iov (io_msg_alloc_async)
> regardless of where the allocated memory came from, cache or slab. In
> the callers (patch 6 and 7), we do:

I see, I guess that makes sense as some things are persistent in cache
and need clearing upfront if freshly allocated.

> +	hdr = io_uring_alloc_async_data(&ctx->netmsg_cache, req);
> +	if (!hdr)
> +		return NULL;
> +
> +	/* If the async data was cached, we might have an iov cached inside. */
> +	if (hdr->free_iov) {
>
> An alternative would be to return a flag indicating whether the
> allocated memory came from the cache or not, but it didn't seem elegant.
> Do you see a better way?
>
> I also considered that zeroing memory here shouldn't harm performance,
> because it'll hit the cache most of the time.

It should hit cache most of the time, but if we exceed the cache size,
then you will see allocations happen and churn. I don't like the idea of
the flag, then we still need to complicate the caller.

We can do something like slab where you have a hook for freshly
allocated data only? That can either be a property of the cache, or
passed in via io_alloc_cache_alloc()?

BTW, I'd probably change the name of that to io_cache_get() or
io_cache_alloc() or something like that, I don't think we need two
allocs in there.

-- 
Jens Axboe
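
For illustration only, the "hook for freshly allocated data only" idea
could look roughly like the userspace sketch below. Everything in it is
hypothetical: struct obj_cache, cache_setup(), cache_alloc(), cache_put(),
the init_once callback and struct async_hdr are stand-ins made up for this
example, not the kernel's struct io_alloc_cache or anything from the patch
series, and malloc() stands in for the kernel allocator. The hook runs only
when the cache is empty and a new object is allocated, so the cached path
skips any clearing and a cached free_iov survives.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of an object cache (not io_alloc_cache). */
struct obj_cache {
	void **entries;
	unsigned int nr_cached;
	unsigned int max_cached;
	size_t elem_size;
};

/* Hook that runs only on objects coming fresh from the allocator. */
typedef void (*cache_init_fn)(void *obj);

static int cache_setup(struct obj_cache *cache, unsigned int max, size_t size)
{
	cache->entries = calloc(max, sizeof(void *));
	if (!cache->entries)
		return -1;
	cache->nr_cached = 0;
	cache->max_cached = max;
	cache->elem_size = size;
	return 0;
}

static void *cache_alloc(struct obj_cache *cache, cache_init_fn init_once)
{
	void *obj;

	/* Fast path: hand back a cached object untouched. */
	if (cache->nr_cached)
		return cache->entries[--cache->nr_cached];

	/* Slow path: fresh memory, initialize only what must be trusted. */
	obj = malloc(cache->elem_size);
	if (obj && init_once)
		init_once(obj);
	return obj;
}

/* Returns 1 if the object was cached, 0 if the caller must free it. */
static int cache_put(struct obj_cache *cache, void *obj)
{
	assert(cache && obj);
	if (cache->nr_cached >= cache->max_cached)
		return 0;
	cache->entries[cache->nr_cached++] = obj;
	return 1;
}

/* Hypothetical async data with a field that persists across reuse. */
struct async_hdr {
	void *free_iov;
	int free_iov_nr;
	char scratch[64];	/* deliberately left uninitialized */
};

static void async_hdr_init_once(void *obj)
{
	struct async_hdr *hdr = obj;

	hdr->free_iov = NULL;
	hdr->free_iov_nr = 0;
}
```

With this shape the caller can test hdr->free_iov directly after
cache_alloc(&cache, async_hdr_init_once) without knowing where the memory
came from, and nothing is zeroed beyond the fields the hook touches.
Making the hook a property of the cache instead of an argument would be
the other variant mentioned above.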