Re: [PATCH RFC] ipc/mqueue: introduce msg cache

On Tue, Dec 20, 2022 at 03:28:11PM -0800, Shakeel Butt wrote:
> On Tue, Dec 20, 2022 at 12:59 PM Roman Gushchin
> <roman.gushchin@xxxxxxxxx> wrote:
> >
> > On Tue, Dec 20, 2022 at 11:53:25AM -0800, Shakeel Butt wrote:
> > > +Vlastimil
> > >
> > > On Tue, Dec 20, 2022 at 10:48 AM Roman Gushchin
> > > <roman.gushchin@xxxxxxxxx> wrote:
> > > >
> > > > Sven Luther reported a regression in the posix message queues
> > > > performance caused by switching to the per-object tracking of
> > > > slab objects introduced by patch series ending with the
> > > > commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all
> > > > allocations").
> > > >
> > > > To mitigate the regression, cache allocated mqueue messages in a small
> > > > percpu cache instead of releasing and re-allocating them every time.
> > > >
> > >
> > > Seems fine to me, but I am wondering what is stopping us from doing this
> > > caching in the slab layer for all accounted allocations? Does this
> > > only make sense for specific scenarios/use-cases?
> >
> > It's far from trivial, unfortunately. Here we have an mqueue object to associate
> > a percpu cache with and the hit rate is expected to be high, assuming the mqueue
> > will be used to pass a lot of messages.
> >
> > With a generic slab cache we return to the necessity of managing
> > the per-cgroup x per-slab-cache x per-cpu free list (or some other data structure),
> > which is already far from trivial, based on the previous experience. It can
> > easily lead to significant memory waste, which will fully offset any perf
> > wins.
> >
> > So probably we need some heuristics to allocate caches only for really hot slab
> > caches and use some sort of a hash map (keyed by cgroup and slab cache) to
> > access freelists. What keeps me from committing more time to this project (aside
> > from not having it) is the lack of real workloads that would noticeably benefit
> > from this work.
> >
> > Sven provided a good example and benchmark to reproduce the regression, so it
> > was easy to justify the work.
> >
> 
> Thanks for the explanation. I think we should add this to the commit
> message as well. I do think we should have a general framework for
> such caching as there are other users (e.g. io_uring) doing the same
> and some future users can take advantage as well, e.g. I think this
> type of caching will be helpful for filelock_cache too. Anyway,
> that can be done in the future.

I agree.
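
For reference, the caching pattern under discussion can be modeled in
userspace roughly as below. This is only a sketch of the idea, not the code
from the patch: the names (msg_cache, msg_get, msg_put), the cache depth and
the explicit cpu argument are all assumptions made for illustration, and a
real percpu cache would need the kernel's percpu and preemption handling.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS     4   /* assumed CPU count for this model */
#define CACHE_SLOTS 2   /* assumed number of cached messages per CPU */

struct msg {
	size_t len;
	char data[64];
};

/* One cache per queue: a few recycled messages for each CPU. */
struct msg_cache {
	struct msg *slot[NR_CPUS][CACHE_SLOTS];
	int count[NR_CPUS];
	long hits, misses;
};

/* Take a message from this CPU's cache, falling back to the allocator. */
static struct msg *msg_get(struct msg_cache *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (c->count[cpu] > 0) {
		c->hits++;
		return c->slot[cpu][--c->count[cpu]];
	}
	c->misses++;
	return malloc(sizeof(struct msg));
}

/* Return a message to this CPU's cache; free it if the cache is full. */
static void msg_put(struct msg_cache *c, int cpu, struct msg *m)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (c->count[cpu] < CACHE_SLOTS) {
		c->slot[cpu][c->count[cpu]++] = m;
		return;
	}
	free(m);
}

/* Release everything still cached, e.g. when the queue is destroyed. */
static void msg_cache_drain(struct msg_cache *c)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		while (c->count[cpu] > 0)
			free(c->slot[cpu][--c->count[cpu]]);
}
```

With a queue that passes many messages, most msg_get() calls are served from
the cache, so the allocator (and its accounting) is only hit on a miss.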

One approach I'm considering is to provide an API for creating objcg-specific
slab caches. All objects allocated from such a slab cache will belong to the
same objcg. In this case the accounting can be even faster than the previous
per-page accounting. The user of such an API will be responsible for the
lifetime of the cache, as well as for deciding whether to use the local or the
global slab cache.
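
To make the idea a bit more concrete, here is a rough userspace model of a
cache bound to a single owner. Nothing below is an existing kernel interface:
struct owner merely stands in for an objcg, and owned_cache_* and
OBJS_PER_SLAB are hypothetical names chosen for this sketch. The point it
illustrates is that, with a single owner, the charge can be applied once when
the cache grows rather than once per object.

```c
#include <assert.h>
#include <stdlib.h>

#define OBJS_PER_SLAB 8   /* assumed number of objects per backing slab */

struct owner {            /* stands in for an objcg in this model */
	size_t charged;       /* bytes currently charged to the owner */
};

struct slab {
	struct slab *next;    /* object storage follows this header */
};

struct owned_cache {
	struct owner *owner;
	size_t obj_size;      /* rounded up to a multiple of pointer size */
	void *free_list;      /* list threaded through the free objects */
	struct slab *slabs;
};

static void owned_cache_init(struct owned_cache *c, struct owner *o,
			     size_t obj_size)
{
	size_t p = sizeof(void *);

	c->owner = o;
	c->obj_size = obj_size ? (obj_size + p - 1) / p * p : p;
	c->free_list = NULL;
	c->slabs = NULL;
}

/* Add one slab and charge the owner once for the whole slab. */
static int owned_cache_grow(struct owned_cache *c)
{
	size_t bytes = OBJS_PER_SLAB * c->obj_size;
	struct slab *s = malloc(sizeof(*s) + bytes);
	char *base;

	if (!s)
		return -1;
	s->next = c->slabs;
	c->slabs = s;
	c->owner->charged += bytes;

	base = (char *)(s + 1);
	for (int i = 0; i < OBJS_PER_SLAB; i++) {
		void **obj = (void **)(base + (size_t)i * c->obj_size);
		*obj = c->free_list;
		c->free_list = obj;
	}
	return 0;
}

/* No per-object accounting here: the owner was charged at grow time. */
static void *owned_cache_alloc(struct owned_cache *c)
{
	void **obj;

	if (!c->free_list && owned_cache_grow(c) != 0)
		return NULL;
	obj = c->free_list;
	c->free_list = *obj;
	return obj;
}

static void owned_cache_free(struct owned_cache *c, void *p)
{
	*(void **)p = c->free_list;
	c->free_list = p;
}

/* The cache user controls the lifetime; destroying it uncharges the owner. */
static void owned_cache_destroy(struct owned_cache *c)
{
	while (c->slabs) {
		struct slab *s = c->slabs;

		c->slabs = s->next;
		c->owner->charged -= OBJS_PER_SLAB * c->obj_size;
		free(s);
	}
	c->free_list = NULL;
}
```

In this model the alloc/free fast path never touches the owner's counter; it
only changes when a slab is added or the cache is torn down.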

Thanks!



