Re: [RFC PATCH 0/3] Disable swap cgroup allocation at system boot stage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 04-12-12 16:35:44, Jeff Liu wrote:
> Hello,

Hi,

> 
> Currently, we allocate pages for swap cgroup when the system is booting up.

That is not precise. We do that during _swapon_ time which usually
happen during boot but still...

> Which means that a particular size of pre-allocated memory(depending
> on the total size of the enabled swap files/partitions) would be
> wasted if there is no child memcg being alive.
>
> This patch set is intended to defer the memory allocation for swap
> cgroup until the first children of memcg was created. Actually, it was
> totally inspired by Glabuer's previous proposed patch set, which can
> be found at: "memcg: do not call page_cgroup_init at system_boot".
> http://lwn.net/Articles/517562/
>
> These patches works to me with some sanity check up.  There must
> have some issues I am not aware of for now, at least, performing
> swapon/swapoff when there have child memcg alives can run into some
> potential race conditions that would end up go into bad_page() path...

The locking is kind of awkward because there are indirect locking
dependencies which are not described anywhere.

> but I'd like to post it early to seek any directions if possible, so
> that I can continue to improve it.

Besides small things I have commented on already there is one bigger
issue. You are assuming that the swap tracking is applicable only
if there is at least one cgroup but the root. This is usually true
because we cannot set up any limit for the root cgroup. But this
doesn't consider that we do _tracking_ even though there is no limit
for the root cgroup. This is important as soon as a task is moved
to other group (with immigrate_on_move enabled). Your series would
cause that the swapped out pages wouldn't be migrated (have a look at
swap_cgroup_cmpxchg). Maybe this is can be tricked out somehow (e.g.
return a root cgroup id if the swap_cgroup is 0).

The patches should be also cleaned up a bit and changelogs enhanced to
be more descriptive. I would suggest the following split up
	- swap_cgroup_swapon should only setup ctrl->length and move the
	  rest to swap_cgroup_prepare
	- introduce memsw_accounting_users (have it 1 by default now)
	  and call swap_cgroup_prepare only if it is > 1. Same applies
	  to lookup_swap_cgroup
	- hackaround charge moving from the root cgroup
	- make memsw_accounting_users 0 by default and increment it on
	  a first non-root cgroup creation + call swap_cgroup_swapon.
	- (optionally) deallocate swap accounting structures on the last
	  non-root cgroup removal - I am not sure this is really
	  necessary but if it is then it should be done only after all
	  references to the memcg are gone rather than from
	  mem_cgroup_destroy

Thanks!
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]