Re: [PATCH V7 2/2] mm: memcg detect no memcgs above softlimit under zone reclaim

On Wed 01-08-12 16:10:32, Rik van Riel wrote:
> On 08/01/2012 03:04 PM, Ying Han wrote:
> 
> >That is true. Hmm, then there are two things I can do:
> >
> >1. for the kswapd case, make sure not to count the root cgroup
> >2. or check nr_scanned, which tells us whether or not the
> >reclaim ever made any attempt
> 
> I am looking at a more advanced case of (3) right
> now.  Once I have the basics working, I will send
> you a prototype (that applies on top of your patches)
> to play with.
> 
> Basically, for every LRU in the system, we can keep
> track of 4 things:
> - reclaim_stat->recent_scanned
> - reclaim_stat->recent_rotated
> - reclaim_stat->recent_pressure
> - LRU size
> 
> The first two represent the fraction of pages on the
> list that are actively used.  The larger the fraction
> of recently used pages, the more valuable the cache
> is. The inverse of that can be used to decide how hard
> to reclaim from this cache, compared to other caches
> (everything else being equal).
> 
> The recent pressure can be used to keep track of how
> many pages we have scanned on each LRU list recently.
> Pressure is scaled with LRU size.
> 
> This would be the basic formula to decide which LRU
> to reclaim from:
> 
>           recent_scanned   LRU size
> score =   -------------- * ----------------
>           recent_rotated   recent_pressure
> 
> 
> In other words, the less the objects on an LRU are
> used, the more we should reclaim from that LRU. The
> larger an LRU is, the more we should reclaim from
> that LRU.

The formula makes sense, but I am afraid it will be hard to tune
into something that doesn't regress. For example, I have seen
workloads with many small groups used to wrap up backup jobs;
those are scanned a lot, and you would also see many rotations
because of the writeback, yet those are definitely better to scan
than a large group which needs to keep its data resident.
Anyway, I am not saying the score approach is a bad idea, but I am
afraid it will be hard to validate and get right.

> The more we have already scanned an LRU, the lower
> its score becomes. At some point, another LRU will
> have the top score, and that will be the target to
> scan.

So you think we shouldn't do the full round over memcgs in
shrink_zone and rather do it the OOM way: pick a victim and hammer it?

> We can adjust the score for different LRUs in different
> ways, eg.:
> - swappiness adjustment for file vs anon LRUs, within
>   an LRU set
> - if an LRU set contains a file LRU with more inactive
>   than active pages, reclaim from this LRU set first
> - if an LRU set is over its soft limit, reclaim from
>   this LRU set first

Maybe we could replace LRU size with (LRU size - soft_limit) in the
above formula?

> 
> This also gives us a nice way to balance memory pressure
> between zones, etc...

-- 
Michal Hocko
SUSE Labs
