On Wed, Jan 04, 2017 at 05:30:37PM -0800, Andrew Morton wrote:
> > My simplified workflow looks like this:
> >
> > 1. Create a Memory CGroup with memory limit
> > 2. Exec a child process
> > 3. Add the child process PID into the Memory CGroup
> > 4. Wait for the child process to finish
> > 5. Remove the Memory CGroup
> >
> > The child processes usually run less than 0.1 seconds, but I have lots of them.
> > Normally, I could run over 10000 child processes per minute, but with newer
> > kernels, I can only do 400-500 executions per minute, and my system becomes
> > extremely sluggish (the only indicator of the weirdness I found is an unusually
> > high load average, which sometimes goes over 250!).
> >
> > Here is a simple reproduction script:
> >
> > #!/bin/sh
> > CGROUP_BASE=/sys/fs/cgroup/memory/qq
> >
> > for i in $(seq 1000); do
> >     echo "Iteration #$i"
> >     sh -c "
> >         mkdir '$CGROUP_BASE'
> >         sh -c 'echo \$$ > $CGROUP_BASE/tasks ; sleep 0.0'
> >         rmdir '$CGROUP_BASE' || true
> >     "
> > done
> > # ===

You're not even running anything concurrently. While I agree with
Michal that cgroup creation and destruction are not the fastest paths,
a load of 250 from a single-threaded testcase is silly.

We recently had a load spike issue with the on-demand memcg slab
cache duplication, but that should have happened in 4.6 already. I
don't see anything suspicious going into memcontrol.c after 4.6.

When the load is high like this, can you check with ps what the
blocked tasks are?

A run with perf record -a might also give us an idea if cycles go to
the wrong place.

I'll try to reproduce this once I have access to my test machine again
next week.
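
For the ps check, something along these lines might do. It is only a sketch and assumes a procps-style ps that accepts -eo with the pid, stat, wchan and comm columns. The helper names filter_blocked and list_blocked are made up for illustration. Tasks in uninterruptible sleep (state D) are the ones that count toward the load average without using CPU, so those are the lines it keeps. The perf invocation in the trailing comment takes -a from the request above; the -g flag and the 10 second window are assumptions.

```shell
#!/bin/sh
# Sketch: show tasks in uninterruptible sleep (state "D").
# Assumes procps-style ps; helper names are hypothetical.

# Reads ps-style lines "PID STAT WCHAN COMMAND" on stdin.
# Keeps the header line and every line whose STAT starts with D.
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Lists blocked tasks system-wide, with the kernel wait channel
# (wchan), which hints at where each task is sleeping.
list_blocked() {
    ps -eo pid,stat,wchan:32,comm | filter_blocked
}

# Possible follow-up while the load is high (not run here):
#   perf record -a -g -- sleep 10
#   perf report
```

Calling list_blocked a few times while the reproduction script runs should show whether the same tasks stay stuck on the same wait channel.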