On Thu, Feb 9, 2017 at 4:46 PM, Vinayak Menon <vinmenon@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> We were trying to implement the per app memory cgroup that Johannes
> suggested (https://lkml.org/lkml/2014/12/19/358) and later discussed during
> Minchan's proposal of per process reclaim

Per app memory cgroups are interesting, but also quite aggressive. Could
you please describe what tasks/workload you have?

> (https://lkml.org/lkml/2016/6/13/570). The test was done on Android target
> with 2GB of RAM and cgroupv1. The first test done was to just create per
> app cgroups without modifying any cgroup controls. 2 kinds of tests were
> done which gives similar kind of observation. One was to just open
> applications in sequence and repeat this N times (20 apps, so around 20
> memcgs max at a time). Another test was to create around 20 cgroups and
> perform a make (not kernel, another less heavy source) in each of them.
>
> It is observed that because of the creation of memcgs per app, the per
> memcg LRU size is so low and results in kswapd priority drop. This results
> in sudden increase in scan at lower priorities. Because of this, kswapd
> consumes around 3 times more time (and thus less pageoutrun), and due to
> the lag in reclaiming memory direct reclaims are more and consumes around
> 2.5 times more time.
>

That does not sound good! Have you been able to test this with older
kernels to see if this is a regression?

> Another observation is that the reclaim->generation check in
> mem_cgroup_iter results in kswapd breaking the memcg lru reclaim loop in
> shrink_zone (this is 4.4 kernel) often. This also contributes to the
> priority drop. A test was done to skip the reclaim generation check in
> mem_cgroup_iter and allow concurrent reclaimers to run at same priority.
> This improved the results reducing the kswapd priority drops (and thus time
> spent in kswapd, allocstalls etc).
> But this problem could be a side effect
> of kswapd running for long and reclaiming slow resulting in many parallel
> direct reclaims.
>
> Some of the stats are shown below
>                        base       per-app-memcg
>
> pgalloc_dma            4982349    5043134
> pgfree                 5249224    5303701
> pgactivate             83480      117088
> pgdeactivate           152407     1021799
> pgmajfault             421        31698
> pgrefill_dma           156884     1027744
> pgsteal_kswapd_dma     128449     97364
> pgsteal_direct_dma     101012     229924
> pgscan_kswapd_dma      132716     109750
> pgscan_direct_dma      104141     265515
> slabs_scanned          58782      116886
> pageoutrun             57         16
> allocstall             1283       3540
>
>
> After this, offloading some of the job to soft reclaim was tried with the
> assumption that it will result in lesser priority drops. The problem is in
> determining the right value to be set for soft reclaim. For e.g. one of the
> main motives behind using memcg in Android is to set different swappiness
> to tasks depending on their importance (foreground, background etc.). In
> such a case we actually do not want to set any soft limits. And in the
> second case when we want to use soft reclaim to offload some work from
> kswapd_shrink_zone on to mem_cgroup_soft_limit_reclaim, it becomes tricky
> to set the soft limit values. I was trying out with different percentage of
> task RSS for setting soft limit, but this actually results in excessive
> scanning by mem_cgroup_soft_limit_reclaim, which as I understand is
> because of always using scan priority of 0. This in turn increases the time
> spent in kswapd. It reduces the kswapd priority drop though.
>

Soft limit setting can be tricky, but my advice is to set it based on how
much you see a particular cgroup using when the system is under memory
pressure.

> Is there a way to mitigate this problem of small lru sizes, priority drop
> and kswapd cpu consumption.
>

I've not investigated or heard of this problem before, so I am not sure if
I have a solution for you.
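To make the soft limit suggestion above a little more concrete, here is a
rough sketch and not a tested recipe. It assumes the cgroup v1 memory
controller files memory.usage_in_bytes and memory.soft_limit_in_bytes, a
4 KiB page size, and that usage is sampled while the system is actually
under memory pressure. The helper names, the use of the median, and the
headroom parameter are all hypothetical choices made for illustration.

```python
from pathlib import Path
from typing import Iterable

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pick_soft_limit(usage_samples: Iterable[int], headroom: float = 0.0) -> int:
    """Derive a soft limit (bytes) from usage samples taken under pressure.

    Takes the (upper) median of the samples, scales it by 1 + headroom,
    and rounds the result down to a multiple of PAGE_SIZE.
    """
    samples = sorted(usage_samples)
    if not samples:
        raise ValueError("need at least one usage sample")
    median = samples[len(samples) // 2]
    limit = int(median * (1.0 + headroom))
    return limit - (limit % PAGE_SIZE)


def read_usage(cgroup_dir: Path) -> int:
    """Read the current usage of one memcg (cgroup v1 file)."""
    return int((cgroup_dir / "memory.usage_in_bytes").read_text())


def apply_soft_limit(cgroup_dir: Path, limit_bytes: int) -> None:
    """Write the chosen soft limit into the memcg (cgroup v1 file)."""
    (cgroup_dir / "memory.soft_limit_in_bytes").write_text(str(limit_bytes))
```

A caller would collect read_usage() values for a given app's memcg
directory during a period of pressure, pass them to pick_soft_limit(), and
hand the result to apply_soft_limit(). Whether the median is the right
statistic for a given workload is an open question; it is only a starting
point that is less sensitive to spikes than the maximum.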
Balbir