On 8/29/24 18:10, Shakeel Butt wrote: > On Thu, Aug 29, 2024 at 11:42:10AM GMT, Vlastimil Babka wrote: >> On 8/28/24 01:52, Shakeel Butt wrote: >> > At the moment, the slab objects are charged to the memcg at the >> > allocation time. However there are cases where slab objects are >> > allocated at the time where the right target memcg to charge it to is >> > not known. One such case is the network sockets for the incoming >> > connection which are allocated in the softirq context. >> > >> > Couple hundred thousand connections are very normal on large loaded >> > server and almost all of those sockets underlying those connections get >> > allocated in the softirq context and thus not charged to any memcg. >> > However later at the accept() time we know the right target memcg to >> > charge. Let's add new API to charge already allocated objects, so we can >> > have better accounting of the memory usage. >> > >> > To measure the performance impact of this change, tcp_crr is used from >> > the neper [1] performance suite. Basically it is a network ping pong >> > test with new connection for each ping pong. >> > >> > The server and the client are run inside 3 level of cgroup hierarchy >> > using the following commands: >> > >> > Server: >> > $ tcp_crr -6 >> > >> > Client: >> > $ tcp_crr -6 -c -H ${server_ip} >> > >> > If the client and server run on different machines with 50 GBPS NIC, >> > there is no visible impact of the change. >> > >> > For the same machine experiment with v6.11-rc5 as base. >> > >> > base (throughput) with-patch >> > tcp_crr 14545 (+- 80) 14463 (+- 56) >> > >> > It seems like the performance impact is within the noise. >> > >> > Link: https://github.com/google/neper [1] >> > Signed-off-by: Shakeel Butt <shakeel.butt@xxxxxxxxx> >> > --- >> > v1: https://lore.kernel.org/all/20240826232908.4076417-1-shakeel.butt@xxxxxxxxx/ >> > Changes since v1: >> > - Correctly handle large allocations which bypass slab >> > - Rearrange code to avoid compilation errors for !CONFIG_MEMCG builds >> > >> > RFC: https://lore.kernel.org/all/20240824010139.1293051-1-shakeel.butt@xxxxxxxxx/ >> > Changes since the RFC: >> > - Added check for already charged slab objects. >> > - Added performance results from neper's tcp_crr >> > >> > include/linux/slab.h | 1 + >> > mm/slub.c | 51 +++++++++++++++++++++++++++++++++ >> > net/ipv4/inet_connection_sock.c | 5 ++-- >> > 3 files changed, 55 insertions(+), 2 deletions(-) >> >> I can take the v3 in slab tree, if net people ack? > > Thanks. > >> >> BTW, will this be also useful for Linus's idea of charging struct files only >> after they exist? But IIRC there was supposed to be also a part where we >> have a way to quickly determine if we're not over limit (while allowing some >> overcharge to make it quicker). >> > > Do you have link to those discussions or pointers to the code? From what > you have described, I think this should work. We have the relevant gfp > flags to control the charging behavior (with some caveats). I think this was the last part of the discussion, and in the cover letter of that there are links to the older threads for more context https://lore.kernel.org/all/CAHk-%3DwhgFtbTxCAg2CWQtDj7n6CEyzvdV1wcCj2qpMfpw0%3Dm1A@xxxxxxxxxxxxxx/ >> Because right now this just overcharges unconditionally, but that's >> understandable when the irq context creating the socket can't know the memcg >> upfront. In the open() situation this is different. >> > > For networking we deliberately overcharges in the irq context (if > needed) and the course correct in the task context. However networking > stack is very robust due to mechanisms like backoff, retransmit to handle > situations like packet drops, allocation failures, congestion etc. Other > subsystem are not that robust against ENOMEM. Once I have more detail I > can follow up on the struct files case. Ack. Agreed with Roman that it would be a separate followup change. > thanks, > Shakeel > >