On Fri 10-02-17 08:39:11, peter enderborg wrote: > On 02/09/2017 09:05 PM, Michal Hocko wrote: > > On Thu 09-02-17 14:21:52, peter enderborg wrote: > >> Fundamental changes: > >> 1 Does NOT take any RCU lock in shrinker functions. > >> 2 It returns same result for scan and counts, so we dont need to do > >> shinker will know when it is pointless to call scan. > >> 3 It does not lock any other process than the one that is > >> going to be killed. > >> > >> Background. > >> The low memory killer scans for process that can be killed to free > >> memory. This can be cpu consuming when there is a high demand for > >> memory. This can be seen by analysing the kswapd0 task work. > >> The stats function added in earler patch adds a counter for waste work. > >> > >> How it works. > >> This patch create a structure within the lowmemory killer that caches > >> the user spaces processes that it might kill. It is done with a > >> sorted rbtree so we can very easy find the candidate to be killed, > >> and knows its properies as memory usage and sorted by oom_score_adj > >> to look up the task with highest oom_score_adj. To be able to achive > >> this it uses oom_score_notify events. > >> > >> This patch also as a other effect, we are now free to do other > >> lowmemorykiller configurations. Without the patch there is a need > >> for a tradeoff between freed memory and task and rcu locks. This > >> is no longer a concern for tuning lmk. This patch is not intended > >> to do any calculation changes other than we do use the cache for > >> calculate the count values and that makes kswapd0 to shrink other > >> areas. > > I have to admit I really do not understand big part of the above > > paragraph as well as how this all is supposed to work. A quick glance > > over the implementation. __lmk_task_insert seems to be only called from > > the oom_score notifier context. If nobody updates the value then no task > > will get into the tree. Or am I missing something really obvious here? > > Moreover oom scores tend to be mostly same for tasks. That means that > > your sorted tree will become sorted by pids in most cases. I do not see > > any sorting based on the rss nor any updates that would reflect updates > > of rss. How can this possibly work? > > The task tree nodes are created,updated or removed from the notifier when > there is a relevant oom_score_adj change. If no one create a task that > is in the range for the lowmemorykiller the tree will be empty. This is > an android feature so the score will be updated very often. It is > part of activity manager to prioritise tasks. Why should we do sort of > rss? Because the current lmk selects the tasks based on rss. And the patch doesn't explain why this is no longer suitable and a different metric shoult be used. If you also consider that the scale of oom_score_adj is quite small, conllisions when you simply sort based on pids which is more than questionable. I really fail to see how this can work reasonably and why the change of the lmk semantic is even acceptable. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>