On Thu, Aug 16, 2012 at 10:45 PM, Glauber Costa <glommer@xxxxxxxxxxxxx> wrote:
> On 08/17/2012 09:46 AM, Ying Han wrote:
>> On Thu, Aug 16, 2012 at 10:38 PM, Glauber Costa <glommer@xxxxxxxxxxxxx> wrote:
>>> On 08/17/2012 12:53 AM, Ying Han wrote:
>>>> This patch moves the main slab shrinking to do_shrink_slab() and restructures
>>>> shrink_slab() to walk the memory cgroup hierarchy. The memcg context is embedded
>>>> inside the shrink_control. The underlying shrinker will respect the new field by
>>>> only reclaiming slab objects charged to that memcg.
>>>>
>>>> The hierarchy walk in shrink_slab() is slightly different from the walk in
>>>> shrink_zone(): the latter walks each memcg once per priority under concurrent
>>>> reclaim threads. That makes less sense for slab, since slab objects are spread
>>>> across the system rather than kept per-zone. So here each shrink_slab() call
>>>> triggers a full walk of every memcg in the sub-tree.
>>>>
>>>> One optimization applies under global reclaim, where we skip walking the whole
>>>> tree and instead call the shrinker with mem_cgroup=NULL. It then ends up
>>>> scanning the full dentry lru list.
>>>>
>>>> Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
>>>> ---
>>>>  mm/vmscan.c | 43 +++++++++++++++++++++++++++++++++++--------
>>>>  1 files changed, 35 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index 6ffdff6..7a3a1a4 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -204,7 +204,7 @@ static inline int do_shrinker_shrink(struct shrinker *shrinker,
>>>>   *
>>>>   * Returns the number of slab objects which we shrunk.
>>>>   */
>>>> -unsigned long shrink_slab(struct shrink_control *shrink,
>>>> +static unsigned long do_shrink_slab(struct shrink_control *shrink,
>>>>                            unsigned long nr_pages_scanned,
>>>>                            unsigned long lru_pages)
>>>>  {
>>>> @@ -214,12 +214,6 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>>>       if (nr_pages_scanned == 0)
>>>>               nr_pages_scanned = SWAP_CLUSTER_MAX;
>>>>
>>>> -     if (!down_read_trylock(&shrinker_rwsem)) {
>>>> -             /* Assume we'll be able to shrink next time */
>>>> -             ret = 1;
>>>> -             goto out;
>>>> -     }
>>>> -
>>>>       list_for_each_entry(shrinker, &shrinker_list, list) {
>>>>               unsigned long long delta;
>>>>               long total_scan;
>>>> @@ -309,8 +303,41 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>>>
>>>>               trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
>>>>       }
>>>> +
>>>> +     return ret;
>>>> +}
>>>> +
>>> It seems to me this will call all shrinkers, regardless of whether or
>>> not they are memcg-aware. Can't we just skip the ones we know not to be
>>> memcg-aware? (basically all non-vfs for the moment...)
>>>
>>> My fear is that if called, they will shrink. And that may not be what we
>>> want.
>>
>> Are you suggesting that we not shrink any slabs other than the dentry cache?
>> I'm not sure that is what we want either. However, maybe we could do that for
>> targeted reclaim, if that is what you meant.
>>
>
> If the other shrinkers are not memcg aware, they will end up discarding
> random objects that may or may not have anything to do with the group
> under pressure, right?

The main contributors of accounted (and reclaimable) slab are vfs objects, and
we know dentries pin inodes, so I wonder how bad the problem would really be.
Do you have a specific example of a shrinker that could cause the problem?

--Ying

> This sounds dangerous to the point I'd prefer not touching them at all.
>
> Obviously, having more memcg-aware shrinkers would void this concern.
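[Editor's illustration] A minimal sketch of how the restructured shrink_slab() described
in the changelog might walk the memcg hierarchy and delegate the per-shrinker work to
do_shrink_slab(). This is not the actual patch body: the shrink_control field name
("memcg") and the use of mem_cgroup_iter() for the sub-tree walk are assumptions made
here for illustration only.

/*
 * Sketch only, under the assumptions stated above.
 */
unsigned long shrink_slab(struct shrink_control *shrink,
                          unsigned long nr_pages_scanned,
                          unsigned long lru_pages)
{
        struct mem_cgroup *root = shrink->memcg;   /* assumed field name */
        struct mem_cgroup *memcg;
        unsigned long ret = 0;

        if (!down_read_trylock(&shrinker_rwsem)) {
                /* Assume we'll be able to shrink next time */
                return 1;
        }

        if (!root) {
                /*
                 * Global reclaim: skip the hierarchy walk and let the
                 * shrinkers scan their full lists (memcg == NULL).
                 */
                ret = do_shrink_slab(shrink, nr_pages_scanned, lru_pages);
                goto out;
        }

        /* Targeted reclaim: full walk of every memcg in the sub-tree. */
        for (memcg = mem_cgroup_iter(root, NULL, NULL); memcg;
             memcg = mem_cgroup_iter(root, memcg, NULL)) {
                shrink->memcg = memcg;
                ret += do_shrink_slab(shrink, nr_pages_scanned, lru_pages);
        }
        shrink->memcg = root;
out:
        up_read(&shrinker_rwsem);
        return ret;
}

Skipping shrinkers that are not memcg-aware, as suggested above, would then be a
per-shrinker check inside do_shrink_slab() when shrink->memcg is non-NULL.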