Re: [PATCH 10/11] io_uring/rsrc: cache struct io_rsrc_node

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/1/23 01:04, Gabriel Krisman Bertazi wrote:
Pavel Begunkov <asml.silence@xxxxxxxxx> writes:

On 3/31/23 15:09, Gabriel Krisman Bertazi wrote:
Pavel Begunkov <asml.silence@xxxxxxxxx> writes:

Add allocation cache for struct io_rsrc_node, it's always allocated and
put under ->uring_lock, so it doesn't need any extra synchronisation
around caches.
Hi Pavel,
I'm curious if you considered using kmem_cache instead of the custom
cache for this case?  I'm wondering if this provokes visible difference in
performance in your benchmark.

I didn't try it, but kmem_cache vs kmalloc, IIRC, doesn't bring us
much, definitely doesn't spare from locking, and the overhead
definitely wasn't satisfactory for requests before.

There is no locks in the fast path of slub, as far as I know.  it has a
per-cpu cache that is refilled once empty, quite similar to the fastpath
of this cache.  I imagine the performance hit in slub comes from the
barrier and atomic operations?

Yeah, I mean all kinds of synchronisation. And I don't think
that's the main offender here, the test is single threaded without
contention and the system was mostly idle.

kmem_cache works fine for most hot paths of the kernel.  I think this

It doesn't for io_uring. There are caches for the net side and now
in the block layer as well. I wouldn't say it necessarily halves
performance but definitely takes a share of CPU.

custom cache makes sense for the request cache, where objects are
allocated at an incredibly high rate.  but is this level of update
frequency a valid use case here?

I can think of some. For example it was of interest before to
install a file for just 2-3 IO operations and also fully bypassing
the normal file table. I rather don't see why we wouldn't use it.

If it is indeed a significant performance improvement, I guess it is
fine to have another user of the cache. But I'd be curious to know how
much of the performance improvement you mentioned in the cover letter is
due to this patch!

It was definitely sticking out in profiles, 5-10% of cycles, maybe more

--
Pavel Begunkov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux