Re: [patch] Revert "memcg: add memory.vmscan_stat"

On Tue, Aug 30, 2011 at 10:12:33AM +0900, KAMEZAWA Hiroyuki wrote:
> On Mon, 29 Aug 2011 17:51:13 +0200
> Johannes Weiner <jweiner@xxxxxxxxxx> wrote:
> 
> > On Tue, Aug 09, 2011 at 08:33:45AM +0900, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 8 Aug 2011 14:43:33 +0200
> > > Johannes Weiner <jweiner@xxxxxxxxxx> wrote:
> > > 
> > > > On Fri, Jul 22, 2011 at 05:15:40PM +0900, KAMEZAWA Hiroyuki wrote:
> > > > > +When under_hierarchy is added in the tail, the number indicates the
> > > > > +total memcg scan of its children and itself.
> > > > 
> > > > In your implementation, statistics are only accounted to the memcg
> > > > triggering the limit and the respectively scanned memcgs.
> > > > 
> > > > Consider the following setup:
> > > > 
> > > >         A
> > > >        / \
> > > >       B   C
> > > >      /
> > > >     D
> > > > 
> > > > If D tries to charge but hits the limit of A, then B's hierarchy
> > > > counters do not reflect the reclaim activity resulting in D.
> > > > 
> > > yes, as I expected.
> > 
> > Andrew,
> > 
> > with a flawed design, the author unwilling to fix it, and two NAKs,
> > can we please revert this before the release?
> 
> How about this ?

> @@ -1710,11 +1711,18 @@ static void mem_cgroup_record_scanstat(s
>  	spin_lock(&memcg->scanstat.lock);
>  	__mem_cgroup_record_scanstat(memcg->scanstat.stats[context], rec);
>  	spin_unlock(&memcg->scanstat.lock);
> -
> -	memcg = rec->root;
> -	spin_lock(&memcg->scanstat.lock);
> -	__mem_cgroup_record_scanstat(memcg->scanstat.rootstats[context], rec);
> -	spin_unlock(&memcg->scanstat.lock);
> +	cgroup = memcg->css.cgroup;
> +	do {
> +		spin_lock(&memcg->scanstat.lock);
> +		__mem_cgroup_record_scanstat(
> +			memcg->scanstat.hierarchy_stats[context], rec);
> +		spin_unlock(&memcg->scanstat.lock);
> +		if (!cgroup->parent)
> +			break;
> +		cgroup = cgroup->parent;
> +		memcg = mem_cgroup_from_cont(cgroup);
> +	} while (memcg->use_hierarchy && memcg != rec->root);

Okay, so this looks correct, but it walks up and updates the counters
of all parents after each memcg scanned, which could have a performance
impact.  Usually, hierarchy statistics are only summed up when a user
reads them.

I don't get why this has to be done completely differently from the
way we usually do things, without any justification whatsoever.

Why do you want to pass a recording structure down the reclaim stack?
Why not make it per-cpu counters that are only summed up, together
with the hierarchy values, when someone is actually interested in
them?  With an interface like mem_cgroup_count_vm_event(), or maybe
even an extension of that function?
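
To make the suggestion concrete, here is a minimal userspace sketch of
count-locally, sum-on-read.  It is not kernel code and not a proposed
patch: every name in it (model_memcg, model_count_scan_event(),
model_read_hierarchy(), MODEL_NR_CPUS, the event enum) is made up for
illustration, the per-cpu storage is modeled as a plain array indexed
by a cpu number the caller passes in, and locking and preemption are
ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a small fixed CPU count stands in for real per-cpu data. */
#define MODEL_NR_CPUS 4

/* Hypothetical event types, not the fields of the actual patch. */
enum model_scan_event { MODEL_SCANNED, MODEL_FREED, MODEL_NR_EVENTS };

struct model_memcg {
	struct model_memcg *first_child;
	struct model_memcg *next_sibling;
	/* One slot per cpu and event; only the local slot is written. */
	unsigned long count[MODEL_NR_CPUS][MODEL_NR_EVENTS];
};

static void model_memcg_init(struct model_memcg *memcg,
			     struct model_memcg *parent)
{
	memset(memcg, 0, sizeof(*memcg));
	memcg->first_child = NULL;
	memcg->next_sibling = NULL;
	if (parent) {
		/* Link into the parent's child list. */
		memcg->next_sibling = parent->first_child;
		parent->first_child = memcg;
	}
}

/* Scan path: bump the scanned memcg's own slot, never walk to parents. */
static void model_count_scan_event(struct model_memcg *memcg, int cpu,
				   enum model_scan_event ev, unsigned long nr)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	memcg->count[cpu][ev] += nr;
}

/* Read path: fold the per-cpu slots of one memcg. */
static unsigned long model_read_local(const struct model_memcg *memcg,
				      enum model_scan_event ev)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += memcg->count[cpu][ev];
	return sum;
}

/* Read path: the memcg itself plus everything below it, summed on demand. */
static unsigned long model_read_hierarchy(const struct model_memcg *memcg,
					  enum model_scan_event ev)
{
	unsigned long sum = model_read_local(memcg, ev);
	const struct model_memcg *child;

	for (child = memcg->first_child; child; child = child->next_sibling)
		sum += model_read_hierarchy(child, ev);
	return sum;
}
```

With the A/B/C/D tree from earlier in the thread, events counted
against D show up in the hierarchy totals of both B and A when they
are read, while the scan path itself only ever touches D's counters.
The cost of the hierarchy walk moves from every scan to every read.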
