Re: [RFC] Making memcg track ownership per address_space or anon_vma

On Thu, Feb 05 2015, Tejun Heo wrote:

> Hey,
>
> On Thu, Feb 05, 2015 at 02:05:19PM -0800, Greg Thelen wrote:
>> >  	A
>> >  	+-B    (usage=2M lim=3M min=2M hosted_usage=2M)
>> >  	  +-C  (usage=0  lim=2M min=1M shared_usage=2M)
>> >  	  +-D  (usage=0  lim=2M min=1M shared_usage=2M)
>> >  	  \-E  (usage=0  lim=2M min=0)
> ...
>> Maybe, but I want to understand more about how pressure works in the
>> child.  As C (or D) allocates non shared memory does it perform reclaim
>> to ensure that its (C.usage + C.shared_usage < C.lim).  Given C's
>
> Yes.
>
>> shared_usage is linked into B.LRU it wouldn't be naturally reclaimable
>> by C.  Are you thinking that charge failures on cgroups with non zero
>> shared_usage would, as needed, induce reclaim of parent's hosted_usage?
>
> Hmmm.... I'm not really sure but why not?  If we properly account for
> the low protection when pushing inodes to the parent, I don't think
> it'd break anything.  IOW, allow the amount beyond the sum of low
> limits to be reclaimed when one of the sharers is under pressure.
>
> Thanks.

I'm not saying that it'd break anything.  I think it's required that
children perform reclaim on shared data hosted in the parent.  The child
is limited by shared_usage, so it needs the ability to reclaim it.  So I
think we're in agreement.  Child will reclaim parent's hosted_usage when
the child is charged for shared_usage.  Ideally the only parental memory
reclaimed in this situation would be shared.  But I think (though I
can't claim to have followed the new memcg philosophy discussions) that
internal nodes in the cgroup tree (i.e. parents) do not have any
resources charged directly to them.  All resources are charged to leaf
cgroups which linger until resources are uncharged.  Thus the LRUs of
the parent will only contain hosted (shared) memory.  This thankfully
focuses parental reclaim on shared pages.  Child pressure will,
unfortunately, reclaim shared pages used by any container.  But if
shared pages are charged to all sharing containers, then reclaiming
them will help relieve pressure in the caller.

So this is a system which charges all cgroups using a shared inode
(recharge on read) for all resident pages of that shared inode.  There's
only one copy of the page in memory on just one LRU, but the page may be
charged to multiple containers' (shared_)usage.
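
To make the accounting concrete, here is a toy model of it as I
understand it.  It is only a sketch: SharedInode and its methods are
made-up names for illustration, not kernel interfaces, and it assumes
that every sharer is charged for every resident page of the inode.

```python
# Toy model only: all names here are hypothetical, not kernel APIs.
PAGE = 4096

class SharedInode:
    def __init__(self):
        self.resident_pages = 0   # each page exists once, on the host's LRU
        self.sharers = set()      # cgroups that have touched the inode

    def read(self, cgroup, new_pages=0):
        """cgroup touches the inode; new_pages are newly faulted in."""
        self.sharers.add(cgroup)
        self.resident_pages += new_pages

    def hosted_usage(self):
        # what the hosting parent (B) sees
        return self.resident_pages * PAGE

    def shared_usage(self, cgroup):
        # assumption: every sharer is charged for all resident pages
        return self.hosted_usage() if cgroup in self.sharers else 0

inode = SharedInode()
inode.read("C", new_pages=512)  # C faults in 2M
inode.read("D")                 # D reads the already cached pages
total_charged = sum(inode.shared_usage(c) for c in ("C", "D", "E"))
```

With 2M resident, C and D are each charged 2M, so the sum of the
children's shared_usage (4M) exceeds the memory actually hosted in B.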

Perhaps I missed it, but what happens when a child's limit is
insufficient to accept all pages shared by its siblings?  Example
starting with 2M cached of a shared file:

	A
	+-B    (usage=2M lim=3M hosted_usage=2M)
	  +-C  (usage=0  lim=2M shared_usage=2M)
	  +-D  (usage=0  lim=2M shared_usage=2M)
	  \-E  (usage=0  lim=1M shared_usage=0)

If E faults in a new 4K page within the shared file, then E is a sharing
participant so it'd be charged the 2M+4K, which pushes E over its 1M
limit.
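
The arithmetic for E, spelled out as a small sketch.  The helper name
charge_after_fault is hypothetical, and the rule it encodes (a new
sharer pays for every resident page plus the page it faults in) is my
reading of the proposal, not something confirmed in this thread.

```python
KIB = 1024
MIB = 1024 * KIB

def charge_after_fault(resident_bytes, fault_bytes):
    """Charge a sharer sees, assuming it pays for every resident page."""
    return resident_bytes + fault_bytes

e_limit = 1 * MIB
e_charge = charge_after_fault(2 * MIB, 4 * KIB)   # 2M cached + new 4K page
e_overage = e_charge - e_limit

print(e_charge, e_limit, e_overage)   # 2101248 1048576 1052672
```

So E would end up 1M+4K over its limit after a single 4K fault.  Under
the same assumption C and D would also be charged 2M+4K against their
2M limits.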
