On Fri 27-01-23 20:43:20, Hyeonggon Yoo wrote:
> On Wed, Jan 25, 2023 at 10:51:05AM +0100, Michal Hocko wrote:
> > On Thu 26-01-23 00:41:15, Hyeonggon Yoo wrote:
> > [...]
> > > > Do you happen to have any perf data collected during those runs? I
> > > > would be interested in the memcg side of things. Maybe we can do
> > > > something better there.
> > >
> > > Yes, below is performance data I've collected.
> > >
> > > 6.1.8-debug-preempt-dirty
> > > =========================
> > >   Overhead  Command    Shared Object     Symbol
> > > +    9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled
> >
> > Thanks! Could you just add callers that are showing in the profile for
> > this call please?
>
> -   14.56%     9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled
>    - 6.37% check_preemption_disabled
>       + 3.48% mod_objcg_state
>       + 1.10% obj_cgroup_charge
>         1.02% refill_obj_stock
>         0.67% memcg_slab_post_alloc_hook
>         0.58% mod_objcg_state
>
> According to perf, many memcg functions call this function
> and that's because __this_cpu_xxxx checks if preemption is disabled.

OK, I see. Thanks! I was thinking whether we can optimize for that but
IIUC __this_cpu* is already an optimized form. mod_objcg_state is
already called with local_lock so raw_cpu* could be used in that path,
but I guess this is not really worth it just to optimize for a debug
compile option.
-- 
Michal Hocko
SUSE Labs
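
For illustration, a minimal sketch of the pattern discussed above, with made-up
names (pcp_stock, nr_updates, mod_stock are hypothetical, not the actual
mm/memcontrol.c code): under CONFIG_DEBUG_PREEMPT every __this_cpu_* access
calls check_preemption_disabled(), while raw_cpu_* skips that check, so inside
a section already serialized by a local_lock the raw variant would avoid the
debug-only overhead showing up in the profile.

/*
 * Sketch only (hypothetical names, not the real memcg code): a per-CPU
 * counter updated under a local_lock. With CONFIG_DEBUG_PREEMPT every
 * __this_cpu_* operation calls check_preemption_disabled(); since the
 * local_lock already serializes this path, raw_cpu_add() would skip
 * that debug check, which is what the raw_cpu* suggestion refers to.
 */
#include <linux/percpu.h>
#include <linux/local_lock.h>

struct pcp_stock {
	local_lock_t lock;
	unsigned long nr_updates;
};

static DEFINE_PER_CPU(struct pcp_stock, pcp_stock) = {
	.lock = INIT_LOCAL_LOCK(lock),
};

static void mod_stock(unsigned long delta)
{
	unsigned long flags;

	local_lock_irqsave(&pcp_stock.lock, flags);
	/* debug builds verify preemption is disabled on every __this_cpu_* op */
	__this_cpu_add(pcp_stock.nr_updates, delta);
	/* raw_cpu_add(pcp_stock.nr_updates, delta) would skip the check */
	local_unlock_irqrestore(&pcp_stock.lock, flags);
}

The trade-off is the one stated above: switching to raw_cpu* in such paths
only saves work on debug-enabled kernels, so it is hard to justify the churn.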