Hello,

On Wed, Oct 05, 2022 at 10:20:54AM -0700, Yosry Ahmed wrote:
> > How long were the stalls? Given that rstats are usually flushed by its
>
> I think 10 seconds while interrupts are disabled is what we need for a
> hard lockup, right?

Oh man, that's a long while. I'd really like to learn more about the
numbers. How many cgroups are being flushed across how many CPUs?

> IIUC you mean that the caller of cgroup_rstat_flush() can call a
> different variant that only flushes a part of the rstat tree and then
> returns, and the caller makes several calls interleaved with
> re-enabling irqs, right? Because the flushing code seems to already do
> this internally if the non-irqsafe version is used.

I was thinking more of that being done inside the flush function.

> I think this might be tricky. In this case the path that caused the
> lockup was memcg_check_events()->mem_cgroup_threshold()->
> __mem_cgroup_threshold()->mem_cgroup_usage()->mem_cgroup_flush_stats().
> Interrupts are disabled by callers of memcg_check_events(), but the
> rstat flush call is made much deeper in the call stack. Whoever is
> disabling interrupts doesn't have access to pause/resume flushing.

Hmm... yeah, I guess it's worthwhile to experiment with selective
flushing for specific paths. That said, we'd still need to address the
whole flush taking too long as well.

> There are also other code paths that used to use
> cgroup_rstat_flush_irqsafe() directly before mem_cgroup_flush_stats()
> was introduced, like mem_cgroup_wb_stats() [1].
>
> This is why I suggested a selective flushing variant of
> cgroup_rstat_flush_irqsafe(), so that flushers that need irqs disabled
> have the ability to only flush a subset of the stats to avoid long
> stalls where possible.

I have nothing against selective flushing, but it's not a free thing to
do, both in terms of complexity and runtime overhead, so let's get some
numbers on how much time is spent where.

Thanks.

--
tejun
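
[Editor's illustration: a minimal sketch of the "chunking inside the
flush function" idea discussed above, i.e. re-enabling interrupts
between per-CPU passes so the worst-case IRQs-off window is bounded.
The functions cgroup_rstat_flush_irq_chunked() and
cgroup_rstat_flush_cpu() are hypothetical names, not the actual kernel
API, and locking against concurrent flushers is elided; the real
flusher lives in kernel/cgroup/rstat.c.]

	static void cgroup_rstat_flush_irq_chunked(struct cgroup *cgrp)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			unsigned long flags;

			/*
			 * Flush one CPU's worth of pending updates with
			 * interrupts disabled...
			 */
			local_irq_save(flags);
			/* hypothetical per-CPU flush helper */
			cgroup_rstat_flush_cpu(cgrp, cpu);
			local_irq_restore(flags);

			/*
			 * ...and let interrupts in between per-CPU
			 * passes, so the longest IRQs-off stall is one
			 * CPU's updates rather than the whole tree
			 * times all CPUs.
			 */
		}
	}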