On Thu, Jul 03, 2014 at 04:48:16PM +0400, Vladimir Davydov wrote:
> Hi,
>
> Typically, when a process calls mmap, it isn't given all the memory pages it
> requested immediately. Instead, only its address space is grown, while the
> memory pages will be actually allocated on the first use. If the system fails
> to allocate a page, it will have no choice except invoking the OOM killer,
> which may kill this or any other process. Obviously, it isn't the best way of
> telling the user that the system is unable to handle his request. It would be
> much better to fail mmap with ENOMEM instead.
>
> That's why Linux has the memory overcommit control feature, which accounts and
> limits VM size that may contribute to mem+swap, i.e. private writable mappings
> and shared memory areas. However, currently it's only available system-wide,
> and there's no way of avoiding OOM in cgroups.
>
> This patch set is an attempt to fill the gap. It implements the resource
> controller for cgroups that accounts and limits address space allocations that
> may contribute to mem+swap.
>
> The interface is similar to the one of the memory cgroup except it controls
> virtual memory usage, not actual memory allocation:
>
>   vm.usage_in_bytes        current vm usage of processes inside cgroup
>                            (read-only)
>
>   vm.max_usage_in_bytes    max vm.usage_in_bytes, can be reset by writing 0
>
>   vm.limit_in_bytes        vm.usage_in_bytes must be <= vm.limit_in_bytes;
>                            allocations that hit the limit will be failed
>                            with ENOMEM
>
>   vm.failcnt               number of times the limit was hit, can be reset
>                            by writing 0
>
> In future, the controller can be easily extended to account for locked pages
> and shmem.

Any thoughts on this?

Thanks.
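
To make the discussion concrete, here is a toy userspace model of how I read
the semantics of the four files above. It is only a sketch, not the code from
the patch set: the class and method names (VmCounter, charge, uncharge,
reset_max_usage, reset_failcnt) are made up for illustration, and I am
assuming that resetting vm.max_usage_in_bytes drops the watermark to the
current usage, as the memory cgroup does.

```python
import errno


class VmCounter:
    """Toy model of the proposed vm.* files (hypothetical, not kernel code)."""

    def __init__(self, limit_in_bytes):
        self.usage_in_bytes = 0        # vm.usage_in_bytes
        self.max_usage_in_bytes = 0    # vm.max_usage_in_bytes
        self.limit_in_bytes = limit_in_bytes  # vm.limit_in_bytes
        self.failcnt = 0               # vm.failcnt

    def charge(self, nbytes):
        """Account an address space allocation, e.g. a private writable mmap."""
        if self.usage_in_bytes + nbytes > self.limit_in_bytes:
            # The allocation would push usage over the limit: count the
            # failure and fail up front with ENOMEM instead of risking OOM.
            self.failcnt += 1
            raise OSError(errno.ENOMEM, "vm limit hit")
        self.usage_in_bytes += nbytes
        self.max_usage_in_bytes = max(self.max_usage_in_bytes,
                                      self.usage_in_bytes)

    def uncharge(self, nbytes):
        """Release previously charged address space, e.g. on munmap."""
        self.usage_in_bytes -= nbytes

    def reset_max_usage(self):
        """Model of writing 0 to vm.max_usage_in_bytes.

        Assumption: the watermark falls back to the current usage.
        """
        self.max_usage_in_bytes = self.usage_in_bytes

    def reset_failcnt(self):
        """Model of writing 0 to vm.failcnt."""
        self.failcnt = 0
```

With a 16K limit, charging 12K succeeds, a further 8K charge fails with
ENOMEM and bumps failcnt while leaving usage untouched, and max_usage keeps
the 12K watermark until it is reset.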