On Mon 16-12-13 10:53:45, Michal Hocko wrote:
> On Mon 16-12-13 17:36:09, Li Zefan wrote:
> > On 2013/12/16 16:36, Hugh Dickins wrote:
> > > CONFIG_MEMCG_SWAP is broken in 3.13-rc. Try something like this:
> > >
> > > mkdir -p /tmp/tmpfs /tmp/memcg
> > > mount -t tmpfs -o size=1G tmpfs /tmp/tmpfs
> > > mount -t cgroup -o memory memcg /tmp/memcg
> > > mkdir /tmp/memcg/old
> > > echo 512M >/tmp/memcg/old/memory.limit_in_bytes
> > > echo $$ >/tmp/memcg/old/tasks
> > > cp /dev/zero /tmp/tmpfs/zero 2>/dev/null
> > > echo $$ >/tmp/memcg/tasks
> > > rmdir /tmp/memcg/old
> > > sleep 1 # let rmdir work complete
> > > mkdir /tmp/memcg/new
> > > umount /tmp/tmpfs
> > > dmesg | grep WARNING
> > > rmdir /tmp/memcg/new
> > > umount /tmp/memcg
> > >
> > > Shows lots of WARNING: CPU: 1 PID: 1006 at kernel/res_counter.c:91
> > > res_counter_uncharge_locked+0x1f/0x2f()
> > >
> > > Breakage comes from 34c00c319ce7 ("memcg: convert to use cgroup id").
> > >
> > > The lifetime of a cgroup id is different from the lifetime of the
> > > css id it replaced: memsw's css_get()s do nothing to hold on to the
> > > old cgroup id, it soon gets recycled to a new cgroup, which then
> > > mysteriously inherits the old's swap, without any charge for it.
> > > (I thought memsw's particular need had been discussed and was
> > > well understood when 34c00c319ce7 went in, but apparently not.)
> > >
> > > The right thing to do at this stage would be to revert that and its
> > > associated commits; but I imagine to do so would be unwelcome to
> > > the cgroup guys, going against their general direction; and I've
> > > no idea how embedded that css_id removal has become by now.
> > >
> > > Perhaps some creative refcounting can rescue memsw while still
> > > using cgroup id?
> > >
> >
> > Sorry for the broken.
> >
> > I think we can keep the cgroup->id until the last css reference is
> > dropped and the css is scheduled to be destroyed.
>
> How would this work?
> The task which pushed the memory to the swap is
> still alive (living in a different group) and the swap will be there
> after the last reference to css as well.

Or did you mean to get a css reference in swap_cgroup_record and release
it in __mem_cgroup_try_charge_swapin? That would prevent the warning
(assuming idr_remove would move to css_free[1]), but I am not sure this is
the right thing to do. memsw charges will already be accounted to the
parent (assuming there is one), with nobody left to uncharge them, because
all uncharges would fall back to the root memcg after css_offline.

Hugh's approach seems much better.

---
[1] Is this even possible? I cannot say I understand the comment above
idr_remove in cgroup_destroy_css_killed 100%, but it suggests we cannot
postpone it to later.
-- 
Michal Hocko
SUSE Labs
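
For readers following the thread, the id-recycling problem Hugh describes can be mimicked in a few lines of userspace C. This is only a toy sketch, not kernel code: every name in it (toy_group, id_table, toy_alloc_id, toy_swap_out and so on) is made up for illustration, it ignores hierarchy and reparenting entirely, and it assumes an allocator that always hands out the lowest free id. It shows swap records that remember a bare id, the id being released and reused, and the later swap frees landing on a group that was never charged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_IDS       4   /* toy id space; id 0 means "no owner" */
#define NR_SWAP_SLOTS 8

/* Hypothetical stand-in for a memcg with a memsw counter. */
struct toy_group {
    long memsw_usage;   /* swap entries currently charged here */
    int  underflows;    /* uncharges that found nothing to uncharge */
};

static struct toy_group *id_table[MAX_IDS];        /* id -> live group */
static unsigned short    swap_owner[NR_SWAP_SLOTS]; /* id stored per slot */

/* Hand out the lowest free id, so a released id is reused immediately. */
static int toy_alloc_id(struct toy_group *g)
{
    for (int id = 1; id < MAX_IDS; id++) {
        if (id_table[id] == NULL) {
            id_table[id] = g;
            return id;
        }
    }
    return -1;
}

/* Release the id. Swap records that still hold it are NOT updated. */
static void toy_free_id(int id)
{
    id_table[id] = NULL;
}

/* Record the owner id in the swap slot and charge the owning group. */
static void toy_swap_out(int id, int slot)
{
    swap_owner[slot] = (unsigned short)id;
    id_table[id]->memsw_usage++;
}

/* Free a swap slot: look the group up by the stored id and uncharge it. */
static void toy_swap_free(int slot)
{
    int id = swap_owner[slot];
    struct toy_group *g = id ? id_table[id] : NULL;

    swap_owner[slot] = 0;
    if (g == NULL)
        return;                 /* the toy ignores unowned slots */
    if (g->memsw_usage == 0)
        g->underflows++;        /* the toy's analogue of the WARNING */
    else
        g->memsw_usage--;
}

/*
 * Old group swaps out three entries and gives up its id; a new group
 * then receives the same id and takes the uncharges. Returns the number
 * of underflows seen by the new group and reports, via leaked_on_old,
 * the charge left stranded on the old group.
 */
int toy_recycle_demo(long *leaked_on_old)
{
    struct toy_group old_group = { 0 }, new_group = { 0 };
    int old_id, new_id, underflows;

    memset(id_table, 0, sizeof(id_table));
    memset(swap_owner, 0, sizeof(swap_owner));

    old_id = toy_alloc_id(&old_group);
    assert(old_id > 0);
    for (int slot = 0; slot < 3; slot++)
        toy_swap_out(old_id, slot);

    toy_free_id(old_id);                 /* "rmdir old" */
    new_id = toy_alloc_id(&new_group);   /* "mkdir new" */
    assert(new_id == old_id);            /* the id was recycled */

    for (int slot = 0; slot < 3; slot++)
        toy_swap_free(slot);             /* "umount tmpfs" */

    if (leaked_on_old)
        *leaked_on_old = old_group.memsw_usage;
    underflows = new_group.underflows;
    toy_free_id(new_id);                 /* drop pointers to the locals */
    return underflows;
}
```

Calling toy_recycle_demo() from a small main() reports one underflow per freed slot on the new group, while the old group's charge is never dropped. Keeping the id reserved until the last swap record referring to it is gone would avoid the mismatch in the toy. Whether that is feasible in the kernel is exactly the open question in [1].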