On 5/7/20 7:25 AM, Konstantin Khlebnikov wrote:
> On 06/05/2020 14.56, Vlastimil Babka wrote:
>> On 5/4/20 6:07 PM, Konstantin Khlebnikov wrote:
>>> To get exact count of free and used objects slub have to scan list of
>>> partial slabs. This may take at long time. Scanning holds spinlock and
>>> blocks allocations which move partial slabs to per-cpu lists and back.
>>>
>>> Example found in the wild:
>>>
>>> # cat /sys/kernel/slab/dentry/partial
>>> 14478538 N0=7329569 N1=7148969
>>> # time cat /sys/kernel/slab/dentry/objects
>>> 286225471 N0=136967768 N1=149257703
>>>
>>> real 0m1.722s
>>> user 0m0.001s
>>> sys 0m1.721s
>>>
>>> The same problem in slab was addressed in commit f728b0a5d72a ("mm, slab:
>>> faster active and free stats") by adding more kmem cache statistics.
>>> For slub same approach requires atomic op on fast path when object frees.
>>
>> In general yeah, but are you sure about this one? AFAICS this is about pages in
>> the n->partial list, where manipulations happen under n->list_lock and shouldn't
>> be fast path. It should be feasible to add a counter under the same lock, so it
>> wouldn't even need to be atomic?
>
> SLUB allocates objects from prepared per-cpu slabs, they could be subtracted from
> count of free object under this lock in advance when slab moved out of this list.
>
> But at freeing path object might belong to any slab, including global partials.

Right, freeing can indeed modify a global partial without taking the lock.
Nevermind then.
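
For anyone following along, here is a rough user-space sketch of why a plain counter kept under n->list_lock would drift. This is a toy model and not kernel code: every name in it (toy_node, toy_slab, partial_free and the helpers) is made up for illustration, the spinlock is a C11 atomic_flag standing in for n->list_lock, and the free path is reduced to a plain decrement where the real code updates the slab locklessly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not SLUB's. */
struct toy_slab {
    unsigned int objects;   /* total objects in the slab */
    unsigned int inuse;     /* objects currently allocated */
    bool on_partial;        /* slab sits on the node partial list */
};

struct toy_node {
    atomic_flag list_lock;       /* stands in for n->list_lock */
    unsigned long partial_free;  /* hypothetical counter, updated under the lock */
};

static void toy_lock(struct toy_node *n)
{
    while (atomic_flag_test_and_set_explicit(&n->list_lock, memory_order_acquire))
        ;  /* spin until the flag is released */
}

static void toy_unlock(struct toy_node *n)
{
    atomic_flag_clear_explicit(&n->list_lock, memory_order_release);
}

/* List manipulation happens under the lock, so a plain add is enough here. */
static void toy_add_partial(struct toy_node *n, struct toy_slab *s)
{
    toy_lock(n);
    s->on_partial = true;
    n->partial_free += s->objects - s->inuse;
    toy_unlock(n);
}

/* Same for removal: the slab's free objects leave the count under the lock. */
static void toy_remove_partial(struct toy_node *n, struct toy_slab *s)
{
    toy_lock(n);
    s->on_partial = false;
    n->partial_free -= s->objects - s->inuse;
    toy_unlock(n);
}

/*
 * Free path: the object may belong to a slab that is still on the global
 * partial list, and the slab is modified without taking list_lock.
 * Nothing here touches n->partial_free, so the counter goes stale unless
 * it is made atomic and updated on this path too.
 */
static void toy_free_object(struct toy_slab *s)
{
    s->inuse--;
}

/* The exact count still needs a scan of the partial slabs under the lock. */
static unsigned long toy_scan_free(struct toy_node *n,
                                   const struct toy_slab *slabs, size_t nr)
{
    unsigned long free = 0;

    toy_lock(n);
    for (size_t i = 0; i < nr; i++)
        if (slabs[i].on_partial)
            free += slabs[i].objects - slabs[i].inuse;
    toy_unlock(n);
    return free;
}
```

Put a slab with 8 objects and 5 in use on the list, free one object, and the counter still says 3 while a scan finds 4. That gap is what forces either the scan or an atomic op on the free path.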