Pavel Begunkov <asml.silence@xxxxxxxxx> writes:

> On 3/31/23 15:09, Gabriel Krisman Bertazi wrote:
>> Pavel Begunkov <asml.silence@xxxxxxxxx> writes:
>>
>>> Add allocation cache for struct io_rsrc_node, it's always allocated and
>>> put under ->uring_lock, so it doesn't need any extra synchronisation
>>> around caches.
>> Hi Pavel,
>> I'm curious if you considered using kmem_cache instead of the custom
>> cache for this case? I'm wondering if this provokes visible difference in
>> performance in your benchmark.
>
> I didn't try it, but kmem_cache vs kmalloc, IIRC, doesn't bring us
> much, definitely doesn't spare from locking, and the overhead
> definitely wasn't satisfactory for requests before.

There are no locks in the fast path of slub, as far as I know. It has a
per-cpu cache that is refilled once empty, quite similar to the fast path
of this cache. I imagine the performance hit in slub comes from the
barrier and atomic operations?

kmem_cache works fine for most hot paths of the kernel. I think this
custom cache makes sense for the request cache, where objects are
allocated at an incredibly high rate, but is this level of update
frequency a valid use case here?

If it is indeed a significant performance improvement, I guess it is
fine to have another user of the cache. But I'd be curious to know how
much of the performance improvement you mentioned in the cover letter is
due to this patch!

--
Gabriel Krisman Bertazi
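
For readers following the thread, the kind of cache being discussed can be sketched in userspace C. This is only an illustration of the general pattern (a bounded free list that relies on a lock the caller already holds, falling back to the general allocator when empty); it is not the io_uring implementation. All names here (obj_cache, obj_cache_get, obj_cache_put, and so on) are hypothetical, and malloc/free stand in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the io_uring code. */

/* While an object sits in the cache, its first bytes hold the list link. */
struct cache_entry {
    struct cache_entry *next;
};

struct obj_cache {
    struct cache_entry *head;   /* singly linked list of free objects */
    unsigned int nr_cached;     /* objects currently held in the list */
    unsigned int max_cached;    /* upper bound on cached objects */
    size_t obj_size;            /* allocation size used on a cache miss */
};

static void obj_cache_init(struct obj_cache *c, size_t obj_size,
                           unsigned int max_cached)
{
    c->head = NULL;
    c->nr_cached = 0;
    c->max_cached = max_cached;
    /* Every object must be large enough to store the list link. */
    c->obj_size = obj_size < sizeof(struct cache_entry)
                      ? sizeof(struct cache_entry) : obj_size;
}

/*
 * Assumption: the caller already holds the lock protecting the cache,
 * so no atomics or barriers are used here.
 */
static void *obj_cache_get(struct obj_cache *c)
{
    struct cache_entry *e = c->head;

    if (e) {
        /* Fast path: pop a previously freed object. */
        c->head = e->next;
        c->nr_cached--;
        return e;
    }
    /* Slow path: fall back to the general-purpose allocator. */
    return malloc(c->obj_size);
}

/* Same locking assumption as obj_cache_get(). */
static void obj_cache_put(struct obj_cache *c, void *obj)
{
    assert(obj != NULL);

    if (c->nr_cached < c->max_cached) {
        struct cache_entry *e = obj;

        /* Push onto the free list instead of freeing. */
        e->next = c->head;
        c->head = e;
        c->nr_cached++;
        return;
    }
    /* Cache is full: hand the object back to the allocator. */
    free(obj);
}

/* Release everything still held by the cache. */
static void obj_cache_free(struct obj_cache *c)
{
    while (c->head) {
        struct cache_entry *e = c->head;

        c->head = e->next;
        free(e);
    }
    c->nr_cached = 0;
}
```

In this sketch an object put back into the cache is returned by the next get without touching the allocator. That is the saving under debate: whether it is measurable at the rate these objects are allocated is exactly the open question in the message above.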