On Fri, Jan 12, 2018 at 4:24 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Fri 12-01-18 00:59:38, Andrey Ryabinin wrote:
>> On 01/11/2018 07:29 PM, Michal Hocko wrote:
> [...]
>> > I do not think so. Consider that this reclaim races with other
>> > reclaimers. Now you are reclaiming a large chunk so you might end up
>> > reclaiming more than necessary. SWAP_CLUSTER_MAX would reduce the
>> > over-reclaim to be negligible.
>> >
>>
>> I did consider this. And I think I already explained that sort of race
>> in a previous email. Whether "Task B" is really a task in the cgroup or
>> it's actually a bunch of reclaimers doesn't matter. That doesn't change
>> anything.
>
> I would _really_ prefer two patches here. The first one removing the
> hard-coded reclaim count. That thing is just dubious at best. If you
> _really_ think that the higher reclaim target is meaningful then make
> it a separate patch. I am not convinced but I will not nack it either.
> But it will make our life much easier if my over-reclaim concern is
> right and we need to revert it. Conceptually those two changes are
> independent anyway.
>

Personally, I feel that the cgroup-v2 semantics are much cleaner for
setting the limit. There is no race with the allocators in the memcg,
though the OOM killer can be triggered. For cgroup-v1, the user does not
expect the OOM killer, and EBUSY is expected on unsuccessful reclaim.
How about we do something similar here and make sure the OOM killer
cannot be triggered for the given memcg?

/* pseudo code */
disable_oom(memcg);
old = xchg(&memcg->memory.limit, requested_limit);
/* reclaim until usage gets below the new limit or retries are exhausted */
if (unsuccessful) {
	reset_limit(memcg, old);
	ret = -EBUSY;
} else {
	ret = 0;
}
enable_oom(memcg);

This way there is no race with the allocators, and the OOM killer will
not be triggered. The processes in the memcg can suffer, but that should
be within the expectation of the user. One disclaimer though: disabling
OOM for a memcg needs more thought. (A slightly more concrete sketch of
the idea follows below.)

Shakeel
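
To make the above a bit more concrete, here is a minimal sketch of how
the idea could map onto existing memcg primitives (page_counter_read(),
try_to_free_mem_cgroup_pages(), and the page_counter limit field).
memcg_disable_oom()/memcg_enable_oom() are hypothetical helpers that do
not exist today, per the disclaimer, and memcg_set_limit_no_oom() is
just a name chosen for this illustration; treat it as a sketch of the
approach, not a tested patch.

#include <linux/memcontrol.h>
#include <linux/page_counter.h>
#include <linux/sched/signal.h>
#include <linux/swap.h>

/* mirrors MEM_CGROUP_RECLAIM_RETRIES in mm/memcontrol.c */
#define SET_LIMIT_RECLAIM_RETRIES	5

static int memcg_set_limit_no_oom(struct mem_cgroup *memcg,
				  unsigned long requested_limit)
{
	int retries = SET_LIMIT_RECLAIM_RETRIES;
	unsigned long old, usage;
	int ret = 0;

	memcg_disable_oom(memcg);		/* hypothetical helper */

	/* Install the new limit up front; allocators now hit it directly. */
	old = xchg(&memcg->memory.limit, requested_limit);

	/* Reclaim until usage fits under the new limit or retries run out. */
	while ((usage = page_counter_read(&memcg->memory)) > requested_limit) {
		if (signal_pending(current)) {
			ret = -EINTR;
			break;
		}
		if (!retries--) {
			ret = -EBUSY;
			break;
		}
		if (!try_to_free_mem_cgroup_pages(memcg,
						  usage - requested_limit,
						  GFP_KERNEL, true))
			cond_resched();
	}

	/* On failure, restore the old limit so the write returns -EBUSY. */
	if (ret)
		xchg(&memcg->memory.limit, old);

	memcg_enable_oom(memcg);		/* hypothetical helper */
	return ret;
}

Note that the sketch asks reclaim for the full excess (usage minus the
requested limit) in one go; whether that should instead be capped at
SWAP_CLUSTER_MAX per iteration is exactly the over-reclaim question
Michal raises above.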