On Fri 29-05-20 11:08:58, Chris Down wrote:
> Michal Hocko writes:
> > > > > task->memcg_nr_pages_over_high is not vague, it's a best-effort
> > > > > mechanism to distribute fairness. It's the current task's share of the
> > > > > cgroup's overage, and it allows us in the majority of situations to
> > > > > distribute reclaim work and sleeps in proportion to how much the task
> > > > > is actually at fault.
> > > >
> > > > Agreed. But this stops being the case as soon as the reclaim target has
> > > > been reached and new reclaim attempts are enforced because the memcg is
> > > > still above the high limit. Because then you have a completely different
> > > > reclaim target - get down to the limit. This would be especially visible
> > > > with a large memcg_nr_pages_over_high which could even lead to an over
> > > > reclaim.
> > >
> > > We actually over reclaim even before this patch -- this patch doesn't bring
> > > much new in that regard.
> > >
> > > Tracing try_to_free_pages for a cgroup at the memory.high threshold shows
> > > that before this change, we sometimes even reclaim on the order of twice the
> > > number of pages requested. For example, I see cases where we requested 1000
> > > pages to be reclaimed, but end up reclaiming 2000 in a single reclaim
> > > attempt.
> >
> > This is interesting and worth looking into. I am aware that we can
> > reclaim potentially much more pages during the icache reclaim and that
> > there was a heated discussion without any fix merged in the end IIRC.
> > Do you have any details?
>
> Sure, we can look into this more, but let's do it separately from this patch
> -- I don't see that its merging should be contingent on that discussion :-)

Yes that is a separate issue.
-- 
Michal Hocko
SUSE Labs