On Tue, Dec 07, 2021 at 04:52:08PM +0100, Sebastian Andrzej Siewior wrote:
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> MEMCG has a few constructs which are not compatible with PREEMPT_RT's
> requirements. This includes:
> - relying on disabled interrupts from spin_lock_irqsave() locking for
>   something not related to lock itself (like the per-CPU counter).

If memory serves me right, this is the VM_WARN_ON_ONCE() in
workingset.c:

	VM_WARN_ON_ONCE(!irqs_disabled());  /* For __inc_lruvec_page_state */

This isn't memcg specific. This is the serialization model of the
generic MM page counters. They can be updated from process and irq
context, and need to avoid preemption (and corruption) during RMW.

!CONFIG_MEMCG:

	static inline void mod_lruvec_kmem_state(void *p, enum node_stat_item idx,
						 int val)
	{
		struct page *page = virt_to_head_page(p);

		mod_node_page_state(page_pgdat(page), idx, val);
	}

which does:

	void mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
				 long delta)
	{
		unsigned long flags;

		local_irq_save(flags);
		__mod_node_page_state(pgdat, item, delta);
		local_irq_restore(flags);
	}

If this breaks PREEMPT_RT, it's broken without memcg too.

> - explicitly disabling interrupts and acquiring a spinlock_t based lock
>   like in memcg_check_events() -> eventfd_signal().

Similar problem to the above: we disable interrupts to protect RMW
sequences that can (on non-PREEMPT_RT) be initiated through process
context as well as irq context.

IIUC, the PREEMPT_RT construct for handling exactly that scenario is
the "local lock". Is that correct?

It appears Ingo has already fixed the LRU cache, which for non-RT also
relies on irq disabling:

	commit b01b2141999936ac3e4746b7f76c0f204ae4b445
	Author: Ingo Molnar <mingo@xxxxxxxxxx>
	Date:   Wed May 27 22:11:15 2020 +0200

	    mm/swap: Use local_lock for protection

The memcg charge cache should be fixable the same way.

Likewise, if you fix the generic vmstat counters like this, the memcg
implementation can follow suit.
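
To make the local lock suggestion concrete, here is a rough sketch of
the shape such a conversion might take for a per-CPU counter. It is
not taken from any patch: struct demo_stat, demo_mod_state() and
demo_read_state() are made-up names. The #else branch holds trivial
userspace stand-ins (an assumption of this sketch, one "CPU", no real
exclusion) so that the fragment also compiles outside the kernel.

```c
#ifdef __KERNEL__
#include <linux/local_lock.h>
#include <linux/percpu.h>
#else
/* Userspace stand-ins, NOT the kernel implementation: one "CPU", no exclusion. */
typedef struct { int held; } local_lock_t;
#define INIT_LOCAL_LOCK(name)		{ 0 }
#define DEFINE_PER_CPU(type, name)	type name
#define this_cpu_ptr(ptr)		(ptr)
#define local_lock_irqsave(l, flags) \
	do { (flags) = 0; this_cpu_ptr(l)->held = 1; } while (0)
#define local_unlock_irqrestore(l, flags) \
	do { (void)(flags); this_cpu_ptr(l)->held = 0; } while (0)
#endif

/* Hypothetical per-CPU counter; the lock names what the section protects. */
struct demo_stat {
	local_lock_t lock;
	long count;
};

static DEFINE_PER_CPU(struct demo_stat, demo_stat) = {
	.lock = INIT_LOCAL_LOCK(lock),
};

/* Read-modify-write of this CPU's counter under the local lock. */
static void demo_mod_state(long delta)
{
	unsigned long flags;
	struct demo_stat *s;

	local_lock_irqsave(&demo_stat.lock, flags);
	s = this_cpu_ptr(&demo_stat);
	s->count += delta;
	local_unlock_irqrestore(&demo_stat.lock, flags);
}

/* Read this CPU's counter under the same lock. */
static long demo_read_state(void)
{
	unsigned long flags;
	long val;

	local_lock_irqsave(&demo_stat.lock, flags);
	val = this_cpu_ptr(&demo_stat)->count;
	local_unlock_irqrestore(&demo_stat.lock, flags);
	return val;
}
```

On !PREEMPT_RT, local_lock_irqsave() boils down to local_irq_save()
plus lockdep annotations, so the behaviour matches the
mod_node_page_state() code quoted above. On PREEMPT_RT it takes a
per-CPU lock and leaves interrupts enabled, which is the same pattern
the mm/swap commit uses.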