On Thu 25-02-21 14:48:58, Tim Chen wrote:
>
>
> On 2/24/21 3:53 AM, Michal Hocko wrote:
> > On Mon 22-02-21 11:48:37, Tim Chen wrote:
> >>
> >>
> >> On 2/22/21 11:09 AM, Michal Hocko wrote:
> >>
> >>>>
> >>>> I actually have tried adjusting the threshold but found that it doesn't work well for
> >>>> the case with uneven memory access frequency between cgroups.  The soft
> >>>> limit for the low memory event cgroup could creep up quite a lot, exceeding
> >>>> the soft limit by hundreds of MB, even
> >>>> if I drop the SOFTLIMIT_EVENTS_TARGET from 1024 to something like 8.
> >>>
> >>> What was the underlying reason? Higher order allocations?
> >>>
> >>
> >> Not high order allocation.
> >>
> >> The reason was because the runaway memcg asks for memory much less often, compared
> >> to the other memcgs in the system.  So it escapes the sampling update and
> >> was not put onto the tree and exceeds the soft limit
> >> pretty badly.  Even if it was put onto the tree and gets page reclaimed below the
> >> limit, it could escape the sampling the next time it exceeds the soft limit.
> >
> > I am sorry but I really do not follow. Maybe I am missing something
> > obvious but the rate of events (charge/uncharge) shouldn't be really
> > important. There is no way to exceed the limit without charging memory
> > (either a new or via task migration in v1 and immigrate_on_move). If you
> > have SOFTLIMIT_EVENTS_TARGET 8 then you should be 128 * 8 events to
> > re-evaluate. Huge pages can make the runaway much bigger but how
> > would it be possible to run away outside of that bound?
>
>
> Michal,
>
> Let's take an extreme case where memcg 1 always generates the
> first event and memcg 2 generates the rest of 128*8-1 events
> and the pattern repeats.

I do not follow. Events are per-memcg, aren't they?
	__this_cpu_read(memcg->vmstats_percpu->targets[target]);
	[...]
	__this_cpu_write(memcg->vmstats_percpu->targets[target], next);
-- 
Michal Hocko
SUSE Labs
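
To make the per-memcg point concrete, here is a minimal userspace sketch of the extreme case described above. It is not the kernel code: struct toy_memcg, toy_event(), toy_run_pattern() and TOY_EVENTS_TARGET are hypothetical names made up for illustration, the period is simply the "128 * 8" figure from this thread, and per-CPU counters are ignored entirely. It only assumes that each memcg compares its own event count against its own target, as the snippet above suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed re-evaluation period: the "128 * 8" figure from the thread. */
#define TOY_EVENTS_TARGET (128UL * 8UL)

/* Hypothetical stand-in for per-memcg event state; not the kernel struct. */
struct toy_memcg {
	unsigned long nr_events;   /* events generated by this memcg only */
	unsigned long next_target; /* event count after which we re-evaluate */
	unsigned long nr_checks;   /* number of re-evaluations so far */
	unsigned long last_check;  /* nr_events at the previous re-evaluation */
	unsigned long max_gap;     /* largest own-event gap between checks */
};

/* Account one event to @memcg; return true if it triggers a re-evaluation. */
bool toy_event(struct toy_memcg *memcg)
{
	memcg->nr_events++;

	/* Wrap-safe comparison against this memcg's own target. */
	if ((long)(memcg->next_target - memcg->nr_events) >= 0)
		return false;

	unsigned long gap = memcg->nr_events - memcg->last_check;

	if (gap > memcg->max_gap)
		memcg->max_gap = gap;
	memcg->last_check = memcg->nr_events;
	memcg->next_target = memcg->nr_events + TOY_EVENTS_TARGET;
	memcg->nr_checks++;
	return true;
}

/*
 * Replay the pattern from the thread: in every round the quiet memcg
 * generates one event and the busy memcg generates the remaining
 * TOY_EVENTS_TARGET - 1 events.
 */
void toy_run_pattern(struct toy_memcg *quiet, struct toy_memcg *busy,
		     unsigned long rounds)
{
	assert(quiet != busy);

	for (unsigned long r = 0; r < rounds; r++) {
		toy_event(quiet);
		for (unsigned long i = 0; i < TOY_EVENTS_TARGET - 1; i++)
			toy_event(busy);
	}
}
```

With two zero-initialized toy_memcg instances and a few thousand rounds, the quiet memcg is re-evaluated far less often in wall-clock terms, but under these assumptions the number of its own events between two re-evaluations stays bounded by the target (plus one), no matter how many events the busy memcg generates in between.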