On Mon, Oct 17, 2022 at 11:52 AM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> Hello.
>
> On Tue, Oct 04, 2022 at 06:17:40PM -0700, Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> > Sorry for the long email :)
>
> (I'll get to other parts sometime in the future. Sorry for my latency :)
>
> > We have recently run into a hard lockup on a machine with hundreds of
> > CPUs and thousands of memcgs during an rstat flush.
> > [...]
>
> I only respond with some remarks to this particular case.
>
> > As you can imagine, with a sufficiently large number of
> > memcgs and cpus, a call to mem_cgroup_flush_stats() might be slow, or
> > in an extreme case like the one we ran into, cause a hard lockup
> > (despite periodically flushing every 4 seconds).
>
> Is this your modification from the upstream value of FLUSH_TIME (that's
> every 2 s)?

It's actually once every 2s like upstream; I got confused by
flush_next_time multiplying the flush interval by 2.

>
> In the mailthread, you also mention >10s for hard-lockups. That sounds
> scary (even with the once per 4 seconds) since with large enough update
> tree (and update activity) periodic flush couldn't keep up.
> Also, it seems to be kind of bad feedback, the longer a (periodic) flush
> takes, the lower is the frequency of them and the more updates may
> accumulate. I.e. one spike in update activity can get the system into
> a spiral of long flushes that won't recover once the activity doesn't
> drop much more.

Yeah it is scary and shouldn't be likely to happen, but it did :( We
can keep coming up with mitigations to try and make it less likely,
but I was hoping we can find something more fundamental, like keeping
track of what we really need to flush or avoiding all flushing in
non-sleepable contexts if possible.

>
> (2nd point should have been about some memcg_check_events() optimization
> or THRESHOLDS_EVENTS_TARGET justifying delayed flush but I've found none to be applicable.
> Just noting that v2 fortunately doesn't have the threshold
> notifications.)

I think even without that, we can still run into the same problem in
other non-sleepable flushing contexts.

>
> Regards,
> Michal
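
To make the interval mix-up above concrete, here is a minimal userspace
sketch, not the kernel code itself. It assumes a FLUSH_TIME of 2 s (the
upstream value Michal quotes), that the periodic worker is requeued after
FLUSH_TIME, and that flush_next_time is set to twice that at flush time.
HZ and the helper names are hypothetical, picked only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 1000ULL                 /* assumed tick rate for this sketch */
#define FLUSH_TIME (2ULL * HZ)     /* base interval: 2 s (assumed upstream value) */

/* Delay after which the periodic flush worker would be queued again. */
static uint64_t periodic_requeue_delay(void)
{
    return FLUSH_TIME;
}

/*
 * Deadline recorded when a flush happens at time 'now' (in ticks).
 * This models flush_next_time being twice the base interval.
 */
static uint64_t next_flush_deadline(uint64_t now)
{
    assert(now <= UINT64_MAX - 2 * FLUSH_TIME); /* no overflow in the sketch */
    return now + 2 * FLUSH_TIME;
}

/*
 * A "delayed" caller flushes only once the deadline has passed, i.e. when
 * the periodic flush appears to have fallen behind. Plain comparison;
 * counter wraparound is ignored here.
 */
static int delayed_caller_should_flush(uint64_t now, uint64_t deadline)
{
    return now > deadline;
}
```

Under these assumptions the periodic worker runs every 2 s, while a
delayed caller flushes on its own only once more than 4 s have passed
since the last flush.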