Re: [RFC v3 0/3] vmpressure_fd: Linux VM pressure notifications

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 07, 2012 at 01:21:36PM +0200, Kirill A. Shutemov wrote:
[...]
> Sorry, I didn't follow previous discussion on this, but could you
> explain what's wrong with memory notifications from memcg?
> As I can see you can get pretty similar functionality using memory
> thresholds on the root cgroup. What's the point?

There are a few reasons we don't use cgroup notifications:

1. We're not interested in the absolute number of pages/KB of available
   memory, as provided by cgroup memory controller. What we're interested
   in is the amount of easily reclaimable memory and new memory
   allocations' cost.

   We can have plenty of "free" memory, of which say 90% will be caches,
   and say 10% idle. But we do want to differentiate these types of memory
   (although not going into details about it), i.e. we want to get
   notified when kernel is reclaiming. And we also want to know when the
   memory comes from swapping others' pages out (well, actually we don't
   call it swap, it's "new allocations cost becomes high" -- it might be a
   result of many factors (swapping, fragmentation, etc.) -- and userland
   might analyze the situation when this happens).

   Exposing all the VM details to userland is not an option -- it is not
   possible to build a stable ABI on this. Plus, it makes it really hard
   for userland to deal with all the low level details of Linux VM
   internals.

   So, no, raw numbers of "free/used KBs" are not interesting at all.

1.5. But it is important to understand that vmpressure_fd() is not
     orthogonal to cgroups (like it was with vmevent_fd()). We want it to
     be "cgroup'able" too. :) But optionally.

2. The last time I checked, cgroups memory controller did not (and I guess
   still does not) not account kernel-owned slabs. I asked several times
   why so, but nobody answered.
   
   But no, this is not the main issue -- per "1.", we're not interested in
   kilobytes.

3. Some folks don't like cgroups: it has a penalty for kernel size, for
   performance and memory wastage. But again, it's not the main issue with
   memcg.

Thanks,
Anton.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]