On Tue, Nov 10, 2020 at 12:50:08PM -0800, Andrew Morton wrote:
> On Tue, 10 Nov 2020 11:57:53 -0800 Roman Gushchin <guro@xxxxxx> wrote:
>
> > In general it's unknown in advance if a slab page will contain
> > accounted objects or not. In order to avoid memory waste, an
> > obj_cgroup vector is allocated dynamically when a need to account
> > of a new object arises. Such approach is memory efficient, but
> > requires an expensive cmpxchg() to set up the memcg/objcgs pointer,
> > because an allocation can race with a different allocation on another
> > cpu.
> >
> > But in some common cases it's known for sure that a slab page will
> > contain accounted objects: if the page belongs to a slab cache with a
> > SLAB_ACCOUNT flag set. It includes such popular objects like
> > vm_area_struct, anon_vma, task_struct, etc.
> >
> > In such cases we can pre-allocate the objcgs vector and simple assign
> > it to the page without any atomic operations, because at this early
> > stage the page is not visible to anyone else.
>
> Was there any measurable performance change from this?

A very simplistic benchmark (allocating 10000000 64-byte objects in a
row) shows a ~15% win. In real life, most workloads do not seem to be
very sensitive to the speed of (accounted) slab allocations.
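
To illustrate the two paths described in the quoted changelog, here is a
rough userspace sketch using C11 atomics. It is not the kernel code: the
struct fake_page type and both function names are hypothetical stand-ins
invented for this example, and calloc() stands in for the kernel
allocator. It only shows why the lazy path needs a compare-and-swap while
the pre-allocation path can get away with a plain initialization.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's types. */
struct obj_cgroup;                          /* opaque, never dereferenced */

struct fake_page {
	_Atomic(struct obj_cgroup **) objcgs;   /* one slot per object */
	size_t nr_objs;
};

/*
 * Lazy path: the page may already be visible to other CPUs, so two
 * allocations can race to install the vector. Publish it with a
 * compare-and-swap and free our copy if somebody else won.
 */
static bool page_alloc_objcgs_lazy(struct fake_page *page)
{
	struct obj_cgroup **vec = calloc(page->nr_objs, sizeof(*vec));
	struct obj_cgroup **expected = NULL;

	if (!vec)
		return false;
	if (!atomic_compare_exchange_strong(&page->objcgs, &expected, vec))
		free(vec);                          /* lost the race */
	return true;
}

/*
 * Eager path: called while the page is still private to the caller.
 * If the cache is known to be accounted (the SLAB_ACCOUNT case), the
 * vector is allocated up front and assigned without any atomic
 * read-modify-write operation.
 */
static bool page_init_objcgs_eager(struct fake_page *page, size_t nr_objs,
				   bool cache_accounted)
{
	struct obj_cgroup **vec = NULL;

	assert(nr_objs > 0);
	page->nr_objs = nr_objs;
	if (cache_accounted) {
		vec = calloc(nr_objs, sizeof(*vec));
		if (!vec)
			return false;
	}
	atomic_init(&page->objcgs, vec);        /* plain initialization */
	return true;
}
```

In this sketch a page from a non-accounted cache starts with a NULL
vector and falls back to the lazy path on the first accounted
allocation, which matches the trade-off described above: memory is only
spent up front when it is known that it will be needed.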