On Wed, Nov 27, 2019 at 7:11 PM David Hildenbrand <david@xxxxxxxxxx> wrote: > > On 27.11.19 02:28, Yafang Shao wrote: > > Let me give this patch description an overhaul: > Well done! Thanks for your work. > > There's one case that the processes in a memcg are all exit (due to OOM > > group or some other reasons), but the file page caches are still exist. > > "When there are no more processes in a memcg (e.g., due to OOM > group), we can still have file pages in the page cache." > > > These file page caches may be protected by memory.min so can't be > > reclaimed. If we can't success to restart the processes in this memcg or > > don't want to make this memcg offline, then we want to drop the file page > > caches. > > "If these pages are protected by memory.min, they can't be reclaimed. > Especially if there won't be another process in this memcg and the memcg > is kept online, we do want to drop these pages from the page cache." > > > The advantage of droping this file caches is it can avoid the reclaimer > > (either kswapd or direct) scanning and reclaiming pages from all memcgs > > exist in this system, because currently the reclaimer will fairly reclaim > > pages from all memcgs if the system is under memory pressure. > > "By dropping these page caches we can avoid reclaimers (e.g., kswapd or > direct) to scan and reclaim pages from all memcgs in the system - > because the reclaimers will try to fairly reclaim pages from all memcgs > in the system when under memory pressure." > > > The possible method to drop these file page caches is setting the > > hard limit of this memcg to 0. Unfortunately this may invoke the OOM killer > > and generates lots of outputs, that should not happen. > > The OOM output is not expected by the admin if he or she wants to drop > > the cahes and knows there're no processes in this memcg. > > "By setting the hard limit of such a memcg to 0, we allow to drop the > page cache of such memcgs. Unfortunately, this may invoke the OOM killer > and generate a lot of output. The OOM output is not expected by an admin > who wants to drop these caches and knows that there are no processes in > this memcg anymore." > > > > > If memcg is not populated, we should not invoke the OOM killer because > > there's nothing to kill. Next time when you start a new process and if the > > max is still bellow usage, the OOM killer will be invoked and your new > > process is killed, so we can cosider it as lazy OOM, that is we have been > > always doing in the kernel. > > "Therefore, if a memcg is not populated, we should not invoke the OOM > killer - there is nothing to kill. The next time a new process is > started in the memcg and the "max" is still below usage, the OOM killer > will be invoked and the new process will be killed." > > 1. I don't think the "lazy OOM" part is relevant. > That doesn't imporatant. > 2. Where is the part that modifies the limits? or did you drop that? is > it part of another patch? > No. it is not part of another patch. Modifying the limits is really a workaround that Michal[1] has told me to fix my problem, while actually it doesn't work, that is why I submit this patch. 1. https://lore.kernel.org/linux-mm/20191126073129.GA20912@xxxxxxxxxxxxxx/ > 3. I think I agree with Michal that modifying the limits smells more > like a configuration thingy to be handled by an admin (especially, adapt > min/max properly). But again, not sure where that change is located :) > I agree with you all, but that is Michal told me to do. See above and the disccussion in this thread. > 4. This patch on its own (if there are no processes, there is nothing to > kill) does not sound too wrong to me. Instead of an endless loop > (besides signals) where we can't make any progress, we exit right away. > Thanks for you feedback. > (I am not yet too familiar with memgc, Michal is clearly the expert :) ) > I agree with you that Michal is an expert, but clearly that Michal is not an expert on this issue. > > > > Fixes: b6e6edcf ("mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage") > > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> > > Cc: Johannes Weiner <hannes@xxxxxxxxxxx> > > Cc: Michal Hocko <mhocko@xxxxxxxx> > > --- > > mm/memcontrol.c | 15 +++++++++++++-- > > 1 file changed, 13 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index 1c4c08b..e936f1b 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -6139,9 +6139,20 @@ static ssize_t memory_max_write(struct kernfs_open_file *of, > > continue; > > } > > > > - memcg_memory_event(memcg, MEMCG_OOM); > > - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0)) > > + /* If there's no procesess, we don't need to invoke the OOM > > + * killer. Then next time when you try to start a process > > + * in this memcg, the max may still bellow usage, and then > > + * this OOM killer will be invoked. This can be considered > > + * as lazy OOM, that is we have been always doing in the > > + * kernel. Pls. Michal, that is really consistency. > > + */ > > + if (cgroup_is_populated(memcg->css.cgroup)) { > > + memcg_memory_event(memcg, MEMCG_OOM); > > + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0)) > > + break; > > + } else { > > break; > > + } > > } > > > > memcg_wb_domain_size_changed(memcg); > > > > > -- > Thanks, > > David / dhildenb > Thanks Yafang