On 3/16/21 7:02 PM, Vlastimil Babka wrote:
> On 3/16/21 11:42 AM, Xunlei Pang wrote:
>> On 3/16/21 2:49 AM, Vlastimil Babka wrote:
>>> On 3/9/21 4:25 PM, Xunlei Pang wrote:
>>>> count_partial() can hold n->list_lock spinlock for quite long, which
>>>> makes much trouble to the system. This series eliminate this problem.
>>>
>>> Before I check the details, I have two high-level comments:
>>>
>>> - patch 1 introduces some counting scheme that patch 4 then changes, could we do
>>> this in one step to avoid the churn?
>>>
>>> - the series addresses the concern that spinlock is being held, but doesn't
>>> address the fact that counting partial per-node slabs is not nearly enough if we
>>> want accurate <active_objs> in /proc/slabinfo because there are also percpu
>>> slabs and per-cpu partial slabs, where we don't track the free objects at all.
>>> So after this series while the readers of /proc/slabinfo won't block the
>>> spinlock, they will get the same garbage data as before. So Christoph is not
>>> wrong to say that we can just report active_objs == num_objs and it won't
>>> actually break any ABI.
>>
>> If maintainers don't mind this inaccuracy which I also doubt its
>> importance, then it becomes easy. For fear that some people who really
>> cares, introducing an extra config(default-off) for it would be a good
>> option.
>
> Great.
>
>>> At the same time somebody might actually want accurate object statistics at the
>>> expense of peak performance, and it would be nice to give them such option in
>>> SLUB. Right now we don't provide this accuracy even with CONFIG_SLUB_STATS,
>>> although that option provides many additional tuning stats, with additional
>>> overhead.
>>> So my proposal would be a new config for "accurate active objects" (or just tie
>>> it to CONFIG_SLUB_DEBUG?) that would extend the approach of percpu counters in
>>> patch 4 to all alloc/free, so that it includes percpu slabs. Without this config
>>> enabled, let's just report active_objs == num_objs.
>> For percpu slabs, the numbers can be retrieved from the existing
>> slub_percpu_partial()->pobjects, looks no need extra work.
>
> Hm, unfortunately it's not that simple, the number there is a snapshot that can
> become wildly inacurate afterwards.
>

It's hard to make it absolutely accurate with percpu counters: the data can
change while you are iterating over all the CPUs and total_objects, not to
mention the percpu freelist cache. I also can't think of a real-world use for
that level of accuracy. I think the sysfs slabs_cpu_partial file should be good
enough for common debugging purposes.
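
To illustrate the race I mean, here is a minimal userspace sketch. It is not kernel code: free_delta[], NR_CPUS, the hook and all function names are made up for the example, and the "concurrent" update is simulated deterministically by a callback that runs right after the reader has visited a CPU.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical per-CPU deltas of free objects; the true value is their sum. */
static long free_delta[NR_CPUS];

/* Called after the reader has visited @cpu, to simulate concurrent activity. */
typedef void (*race_hook_t)(int cpu);

static void reset_counters(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		free_delta[cpu] = 0;
}

/*
 * Right after the reader has passed CPU 0, one object is freed on CPU 0 and
 * one is allocated on CPU 1. The true total does not change.
 */
static void move_object_after_cpu0(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cpu == 0) {
		free_delta[0] += 1;
		free_delta[1] -= 1;
	}
}

/* Lockless sum over all CPUs, roughly what a statistics reader might do. */
static long sum_free_delta(race_hook_t hook)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		sum += free_delta[cpu];
		if (hook)
			hook(cpu);
	}
	return sum;
}
```

Starting from reset_counters(), sum_free_delta(NULL) returns 0, while sum_free_delta(move_object_after_cpu0) returns -1 even though the true total never left 0. The reader saw the old value on CPU 0 and the new value on CPU 1, which is the kind of inconsistency a lockless walk over the CPUs cannot rule out.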