On Thu, Aug 18, 2011 at 2:40 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > (cc linux-arch) > > On Wed, 17 Aug 2011 23:50:53 -0700 > Greg Thelen <gthelen@xxxxxxxxxx> wrote: > >> Both mem_cgroup_charge_statistics() and mem_cgroup_move_account() were >> unnecessarily disabling preemption when adjusting per-cpu counters: >> preempt_disable() >> __this_cpu_xxx() >> __this_cpu_yyy() >> preempt_enable() >> >> This change does not disable preemption and thus CPU switch is possible >> within these routines. This does not cause a problem because the total >> of all cpu counters is summed when reporting stats. Now both >> mem_cgroup_charge_statistics() and mem_cgroup_move_account() look like: >> this_cpu_xxx() >> this_cpu_yyy() >> >> ... >> >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -664,24 +664,20 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem, >> static void mem_cgroup_charge_statistics(struct mem_cgroup *mem, >> bool file, int nr_pages) >> { >> - preempt_disable(); >> - >> if (file) >> - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages); >> + this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages); >> else >> - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages); >> + this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages); >> >> /* pagein of a big page is an event. So, ignore page size */ >> if (nr_pages > 0) >> - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]); >> + this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]); >> else { >> - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]); >> + this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]); >> nr_pages = -nr_pages; /* for event */ >> } >> >> - __this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages); >> - >> - preempt_enable(); >> + this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages); >> } > > On non-x86 architectures this_cpu_add() internally does > preempt_disable() and preempt_enable(). So the patch is a small > optimisation for x86 and a larger deoptimisation for non-x86. > > I think I'll apply it, as the call frequency is low (correct?) and the > problem will correct itself as other architectures implement their > atomic this_cpu_foo() operations. mem_cgroup_charge_statistics() is a common operation, which is called on each memcg page charge and uncharge. The per arch/config effects of this patch: * non-preemptible kernels: there's no difference before/after this patch. * preemptible x86: this patch helps by removing an unnecessary preempt_disable/enable. * preemptible non-x86: this patch hurts by adding implicit preempt_disable/enable around each operation. So I am uncomfortable this patch's unmeasured impact on archs that do not have atomic this_cpu_foo() operations. Please drop the patch from mmotm. Sorry for the noise. -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html