On Mon, Jul 21, 2014 at 01:46:55PM +0200, Michal Hocko wrote:
> On Mon 21-07-14 11:07:24, Michal Hocko wrote:
> > On Fri 18-07-14 19:44:43, Vladimir Davydov wrote:
> > > On Wed, Jul 16, 2014 at 11:58:14AM -0400, Johannes Weiner wrote:
> > > > On Wed, Jul 16, 2014 at 04:39:38PM +0200, Michal Hocko wrote:
> > > > > +#ifdef CONFIG_MEMCG_KMEM
> > > > > +	{
> > > > > +		.name = "kmem.limit_in_bytes",
> > > > > +		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> > > > > +		.write = mem_cgroup_write,
> > > > > +		.read_u64 = mem_cgroup_read_u64,
> > > > > +	},
> > > >
> > > > Does it really make sense to have a separate limit for kmem only?
> > > > IIRC, the reason we introduced this was that this memory is not
> > > > reclaimable and so we need to limit it.
> > > >
> > > > But the opposite effect happened: because it's not reclaimable, the
> > > > separate kmem limit is actually unusable for any values smaller than
> > > > the overall memory limit: because there is no reclaim mechanism for
> > > > that limit, once you hit it, it's over, there is nothing you can do
> > > > anymore. The problem isn't so much unreclaimable memory, the problem
> > > > is unreclaimable limits.
> > > >
> > > > If the global case produces memory pressure through kernel memory
> > > > allocations, we reclaim page cache, anonymous pages, inodes, dentries
> > > > etc. I think the same should happen for kmem: kmem should just be
> > > > accounted and limited in the overall memory limit of a group, and when
> > > > pressure arises, we go after anything that's reclaimable.
> > >
> > > Personally, I don't think there's much sense in having a separate knob
> > > for kmem limit either. Until we have a user with a sane use case for it,
> > > let's not propagate it to the new interface.
> >
> > What about fork-bomb forks protection? I thought that was the primary usecase
> > for K < U? Or how can we handle that use case with a single limit?
> > A special gfp flag to not trigger OOM path when called from some kmem
> > charge paths?
>
> Even then, I do not see how would this fork-bomb prevention work without
> causing OOMs and killing other processes within the group. The danger
> would be still contained in the group and prevent from the system wide
> disruption. Do we really want only such a narrow usecase?

I think it's all about how we're going to use memory cgroups. If we're
going to use them for application containers, there's simply no such
problem, because we only want to isolate a potentially dangerous process
group from the rest of the system. If we want to start a fully
virtualized OS inside a container, then we certainly need a kind of
numproc and/or kmem limiter to prevent processes inside a cgroup from
being OOM killed by a fork-bomb. IMHO, the latter will always be better
done by VMs, so it isn't a must-have for cgroups. I may be mistaken
though.

Thanks.
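
For illustration only, the two policies compared in the quoted thread can be sketched as a toy userspace model. This is not kernel code and does not reflect how memcg is actually implemented. Every name here (struct toy_group, toy_reclaim, toy_charge_kmem_unified, toy_charge_kmem_separate) is hypothetical, and sizes are plain page counts. It assumes, as the thread argues, that nothing can be reclaimed against a separate kmem limit (K), while a charge against the overall limit (U) may first reclaim user pages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one memory cgroup. All names are hypothetical. */
struct toy_group {
	size_t usage;       /* pages charged in total (user + kernel) */
	size_t limit;       /* overall limit, "U" in the thread */
	size_t kmem_usage;  /* kernel pages only */
	size_t kmem_limit;  /* separate kmem limit, "K" in the thread */
	size_t reclaimable; /* user pages that reclaim could free */
};

/* Free up to 'want' reclaimable user pages; return how many were freed. */
static size_t toy_reclaim(struct toy_group *g, size_t want)
{
	size_t freed = want < g->reclaimable ? want : g->reclaimable;

	g->reclaimable -= freed;
	g->usage -= freed;
	return freed;
}

/*
 * Single-limit policy: kernel pages are charged against the overall
 * limit only, and pressure is relieved by reclaiming user pages.
 */
static bool toy_charge_kmem_unified(struct toy_group *g, size_t pages)
{
	if (g->usage + pages > g->limit) {
		size_t over = g->usage + pages - g->limit;

		if (toy_reclaim(g, over) < over)
			return false; /* nothing left to reclaim */
	}
	g->usage += pages;
	g->kmem_usage += pages;
	return true;
}

/*
 * Separate-limit policy (K < U): hitting K fails at once, because
 * this model has no reclaim that would lower kmem_usage.
 */
static bool toy_charge_kmem_separate(struct toy_group *g, size_t pages)
{
	if (g->kmem_usage + pages > g->kmem_limit)
		return false;
	return toy_charge_kmem_unified(g, pages);
}
```

In this model a group sitting close to K refuses further kernel charges even when plenty of user memory could be reclaimed, whereas the single-limit variant succeeds by reclaiming first. The model says nothing about which processes would be OOM killed once reclaim runs dry, which is the fork-bomb concern raised above.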