Re: [PATCH 2/3] mm: page allocator: Calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Fri, 3 Sep 2010 16:28:21 -0700

On Fri, 3 Sep 2010 18:17:46 -0500 (CDT)
Christoph Lameter <cl@xxxxxxxxx> wrote:

> On Fri, 3 Sep 2010, Andrew Morton wrote:
> 
> > Can someone remind me why per_cpu_pageset went and reimplemented
> > percpu_counters rather than just using them?
> 
> The vm counters are per zone and per cpu and have a flow from per cpu /
> zone deltas to zone counters and then also into global counters.

Hm.  percpu_counters would need hooks at overflow time (when a per-cpu
delta is folded into the shared count) to do that.  Might be worth
looking at.

> > Is this really the best way of doing it?  The way we usually solve
> > this problem (and boy, was this bug a newbie mistake!) is:
> >
> > 	foo = percpu_counter_read(x);
> >
> > 	if (foo says something bad) {
> > 		/* Bad stuff: let's get a more accurate foo */
> > 		foo = percpu_counter_sum(x);
> > 	}
> >
> > 	if (foo still says something bad)
> > 		do_bad_thing();
> >
> > In other words, don't do all this stuff with percpu_drift_mark and the
> > kswapd heuristic.  Just change zone_watermark_ok() to use the more
> > accurate read if it's about to return "no".
> 
> percpu counters must always be added up when their value is determined.

Nope.  That's the difference between percpu_counter_read() and
percpu_counter_sum().
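
To make the distinction concrete, here is a rough single-threaded userspace
model. It is not the kernel code: model_counter, model_add(), model_read(),
model_sum(), NR_CPUS=4 and BATCH=32 are all made up for illustration, and
there is no locking. The cheap read returns only the shared count, so it
can be stale by whatever is still parked in the per-cpu deltas. The sum
walks every cpu and is exact.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4    /* hypothetical CPU count for this model */
#define BATCH   32   /* hypothetical per-CPU fold threshold */

struct model_counter {
	int64_t count;            /* shared value, only approximately right */
	int32_t local[NR_CPUS];   /* per-CPU deltas not yet folded into count */
};

/* Update one CPU's delta; fold it into the shared count once it hits BATCH. */
static void model_add(struct model_counter *c, int cpu, int32_t amount)
{
	int32_t v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	v = c->local[cpu] + amount;
	if (v >= BATCH || v <= -BATCH) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: shared count only, ignores unfolded per-CPU deltas. */
static int64_t model_read(const struct model_counter *c)
{
	return c->count;
}

/* Accurate read: shared count plus every CPU's pending delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model, if each of the four cpus has subtracted 10, model_read()
still returns 0 while model_sum() returns -40. Nothing has crossed the
batch threshold yet, so nothing has been folded.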

> This seems to be a special case where Mel does not want to pay the
> cost of bringing the counters up to date, nor reduce the delta/time
> limits to get more accuracy, but wants to take some sort of snapshot
> of the whole situation for this particular case.

My suggestion didn't actually have anything to do with percpu_counters.
