On Tue, Jan 15, 2019 at 11:06 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Wed 16-01-19 11:52:08, Fam Zheng wrote:
> [...]
> > > This is what force_empty is supposed to do. But, as your test shows,
> > > some page cache may still remain after force_empty, which then causes
> > > offline memcgs to accumulate. I haven't figured out what happened. You
> > > may try what Michal suggested.
> >
> > None of the existing patches helped so far, but we suspect that the
> > pages cannot be locked at the force_empty moment. We have been
> > working on a “retry” patch which does solve the problem. We’ll
> > do more tracing (to have a better understanding of the issue) and post
> > the findings and/or the patch later. Thanks.
>
> Just for the record. There was a patch to remove
> MEM_CGROUP_RECLAIM_RETRIES restriction in the path. I cannot find the
> link right now but that is something we certainly can do. The context is
> interruptible by signal and from my experience any retry count can

Do you mean this one: https://lore.kernel.org/patchwork/patch/865835/ ?
I think removing the retries is feasible as long as the exit is handled
correctly.

Yang

> lead to unexpected failures. But I guess you really want to check
> vmscan tracepoints to see why you cannot reclaim pages on memcg LRUs
> first.
> --
> Michal Hocko
> SUSE Labs