On 2022-02-21 12:31:18 [+0100], To Shakeel Butt wrote:
> > The call chains from rmap.c have not really disabled irqs. Actually
> > there is a comment in do_page_add_anon_rmap() "We use the irq-unsafe
> > __{inc|mod}_zone_page_stat because these counters are not modified in
> > interrupt context, and pte lock(a spinlock) is held, which implies
> > preemption disabled".
> >
> > VM_BUG_ON(!irqs_disabled()) within memcg_stats_lock() would be giving
> > false error reports for CONFIG_PREEMPT_NONE kernels.
>
> So three callers, including do_page_add_anon_rmap():
>   __mod_lruvec_page_state() -> __mod_lruvec_state() -> __mod_memcg_lruvec_state()
>
> are affected. Here we get false warnings because interrupts may not be
> disabled, and that is intentional. Hmmm.
> The odd part is that this only affects certain idx values, so any kind
> of additional debugging would need to take this into account.
> What about memcg_rstat_updated()? It does:
>
> |	x = __this_cpu_add_return(stats_updates, abs(val));
> |	if (x > MEMCG_CHARGE_BATCH) {
> |		atomic_add(x / MEMCG_CHARGE_BATCH, &stats_flush_threshold);
> |		__this_cpu_write(stats_updates, 0);
> |	}
>
> The writes to stats_updates can happen from IRQ context as well as from
> task context with only preemption disabled. So this is not good, right?

So I made the following to avoid the wrong assert. Still not sure how bad
the hunk quoted above is.
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 97a88b63ee983..1bac4798b72ba 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -645,6 +645,13 @@ static void memcg_stats_lock(void)
 #endif
 }
 
+static void __memcg_stats_lock(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+	preempt_disable();
+#endif
+}
+
 static void memcg_stats_unlock(void)
 {
 #ifdef CONFIG_PREEMPT_RT
@@ -728,7 +735,20 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
 	memcg = pn->memcg;
 
-	memcg_stats_lock();
+	/*
+	 * The callers from rmap rely on disabled preemption because they
+	 * never update their counter from in-interrupt context. For these
+	 * two counters we check that the update is never performed from an
+	 * interrupt context while other callers need interrupts disabled.
+	 */
+	__memcg_stats_lock();
+	if (IS_ENABLED(CONFIG_DEBUG_VM)) {
+		if (idx == NR_ANON_MAPPED || idx == NR_FILE_MAPPED)
+			WARN_ON_ONCE(!in_task());
+		else
+			WARN_ON_ONCE(!irqs_disabled());
+	}
+
 	/* Update memcg */
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
 

Sebastian