On Tue, Aug 27, 2024 at 4:52 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote: > > At the moment, the slab objects are charged to the memcg at the > allocation time. However there are cases where slab objects are > allocated at the time where the right target memcg to charge it to is > not known. One such case is the network sockets for the incoming > connection which are allocated in the softirq context. > > Couple hundred thousand connections are very normal on large loaded > server and almost all of those sockets underlying those connections get > allocated in the softirq context and thus not charged to any memcg. > However later at the accept() time we know the right target memcg to > charge. Let's add new API to charge already allocated objects, so we can > have better accounting of the memory usage. > > To measure the performance impact of this change, tcp_crr is used from > the neper [1] performance suite. Basically it is a network ping pong > test with new connection for each ping pong. > > The server and the client are run inside 3 level of cgroup hierarchy > using the following commands: > > Server: > $ tcp_crr -6 > > Client: > $ tcp_crr -6 -c -H ${server_ip} > > If the client and server run on different machines with 50 GBPS NIC, > there is no visible impact of the change. > > For the same machine experiment with v6.11-rc5 as base. > > base (throughput) with-patch > tcp_crr 14545 (+- 80) 14463 (+- 56) > > It seems like the performance impact is within the noise. > > Link: https://github.com/google/neper [1] > Signed-off-by: Shakeel Butt <shakeel.butt@xxxxxxxxx> > --- > v1: https://lore.kernel.org/all/20240826232908.4076417-1-shakeel.butt@xxxxxxxxx/ > Changes since v1: > - Correctly handle large allocations which bypass slab > - Rearrange code to avoid compilation errors for !CONFIG_MEMCG builds > > RFC: https://lore.kernel.org/all/20240824010139.1293051-1-shakeel.butt@xxxxxxxxx/ > Changes since the RFC: > - Added check for already charged slab objects. > - Added performance results from neper's tcp_crr > > include/linux/slab.h | 1 + > mm/slub.c | 51 +++++++++++++++++++++++++++++++++ > net/ipv4/inet_connection_sock.c | 5 ++-- > 3 files changed, 55 insertions(+), 2 deletions(-) > > diff --git a/include/linux/slab.h b/include/linux/slab.h > index eb2bf4629157..05cfab107c72 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -547,6 +547,7 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru, > gfp_t gfpflags) __assume_slab_alignment __malloc; > #define kmem_cache_alloc_lru(...) alloc_hooks(kmem_cache_alloc_lru_noprof(__VA_ARGS__)) > > +bool kmem_cache_charge(void *objp, gfp_t gfpflags); > void kmem_cache_free(struct kmem_cache *s, void *objp); > > kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t flags, > diff --git a/mm/slub.c b/mm/slub.c > index c9d8a2497fd6..8265ea5f25be 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -2185,6 +2185,43 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p, > > __memcg_slab_free_hook(s, slab, p, objects, obj_exts); > } > + > +#define KMALLOC_TYPE (SLAB_KMALLOC | SLAB_CACHE_DMA | \ > + SLAB_ACCOUNT | SLAB_RECLAIM_ACCOUNT) > + > +static __fastpath_inline > +bool memcg_slab_post_charge(void *p, gfp_t flags) > +{ > + struct slabobj_ext *slab_exts; > + struct kmem_cache *s; > + struct folio *folio; > + struct slab *slab; > + unsigned long off; > + > + folio = virt_to_folio(p); > + if (!folio_test_slab(folio)) { > + return __memcg_kmem_charge_page(folio_page(folio, 0), flags, > + folio_order(folio)) == 0; > + } > + > + slab = folio_slab(folio); > + s = slab->slab_cache; > + > + /* Ignore KMALLOC_NORMAL cache to avoid circular dependency. */ > + if ((s->flags & KMALLOC_TYPE) == SLAB_KMALLOC) > + return true; Taking a step back here, why do we need this? Which circular dependency are we avoiding here?