On Thu, Dec 22, 2022 at 02:50:44PM +0100, Michal Koutný wrote:
> On Tue, Dec 20, 2022 at 10:27:45AM -0800, Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
> > To charge a freshly allocated kernel object to a memory cgroup, the
> > kernel needs to obtain an objcg pointer. Currently it does it
> > indirectly by obtaining the memcg pointer first and then calling to
> > __get_obj_cgroup_from_memcg().
>
> Jinx [1].
>
> You report additional 7% improvement with this patch (focused on
> allocations only). I didn't see impressive numbers (different benchmark
> in [1]), so it looked as a microoptimization without big benefit to me.

Hi Michal!

Thank you for taking a look.
Do you have any numbers to share?

In general, I agree that it's a micro-optimization, but:
1) some people periodically complain that accounted allocations are slow
   in comparison to non-accounted ones and slower than they were with
   page-based accounting,
2) I don't see any particular hot spot or obviously non-optimal place on
   the allocation path.

So if we want to make it faster, we have to micro-optimize it here and
there; there is no other way. It's basically a question of how many cache
lines we touch.

Btw, I'm working on patch 3 for this series, which in early tests brings
an additional ~25% improvement in my benchmark. I hope to post it soon as
part of v1.

Thanks!