Re: [PATCH v3 3/5] mm/memcg: Protect per-CPU counter by disabling preemption on PREEMPT_RT where needed.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2022-02-18 09:25:29 [-0800], Shakeel Butt wrote:
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 0b5117ed2ae08..36ab3660f2c6d 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -630,6 +630,28 @@ static DEFINE_SPINLOCK(stats_flush_lock);
> >  static DEFINE_PER_CPU(unsigned int, stats_updates);
> >  static atomic_t stats_flush_threshold = ATOMIC_INIT(0);
> >
> > +/*
> > + * Accessors to ensure that preemption is disabled on PREEMPT_RT because it can
> > + * not rely on this as part of an acquired spinlock_t lock. These functions are
> > + * never used in hardirq context on PREEMPT_RT and therefore disabling preemtion
> > + * is sufficient.
> > + */
> > +static void memcg_stats_lock(void)
> > +{
> > +#ifdef CONFIG_PREEMPT_RT
> > +      preempt_disable();
> > +#else
> > +      VM_BUG_ON(!irqs_disabled());
> > +#endif
> > +}
> > +
> > +static void memcg_stats_unlock(void)
> > +{
> > +#ifdef CONFIG_PREEMPT_RT
> > +      preempt_enable();
> > +#endif
> > +}
> > +
> >  static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
> >  {
> >         unsigned int x;
> > @@ -706,6 +728,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
> >         pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> >         memcg = pn->memcg;)
> >
> > +       memcg_stats_lock();
> 
> The call chains from rmap.c have not really disabled irqs. Actually
> there is a comment in do_page_add_anon_rmap() "We use the irq-unsafe
> __{inc|mod}_zone_page_stat because these counters are not modified in
> interrupt context, and pte lock(a spinlock) is held, which implies
> preemption disabled".
> 
> VM_BUG_ON(!irqs_disabled()) within memcg_stats_lock() would be giving
> false error reports for CONFIG_PREEMPT_NONE kernels.

So three caller, including do_page_add_anon_rmap():
   __mod_lruvec_page_state() -> __mod_lruvec_state() -> __mod_memcg_lruvec_state()

is affected. Here we get false warnings because interrupts may not be
disabled and it is intended. Hmmm.
The odd part is that this only affects certain idx so any kind of
additional debugging would need to take this into account.
What about memcg_rstat_updated()? It does:

|         x = __this_cpu_add_return(stats_updates, abs(val));
|         if (x > MEMCG_CHARGE_BATCH) {
|                 atomic_add(x / MEMCG_CHARGE_BATCH, &stats_flush_threshold);
|                 __this_cpu_write(stats_updates, 0);
|         }

The writes to stats_updates can happen from IRQ-context and with
disabled preemption only. So this is not good, right?

Sebastian




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux