Re: [PATCH 2/3] mm: page allocator: Calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake

Christoph Lameter <cl@xxxxxxxxx> · Fri, 3 Sep 2010 18:17:46 -0500 (CDT)

On Fri, 3 Sep 2010, Andrew Morton wrote:

> Can someone remind me why per_cpu_pageset went and reimplemented
> percpu_counters rather than just using them?

The vm counters are per zone and per cpu and have a flow from per cpu /
zone deltas to zone counters and then also into global counters.

> Is this really the best way of doing it?  The way we usually solve
> this problem (and boy, was this bug a newbie mistake!) is:
>
> 	foo = percpu_counter_read(x);
>
> 	if (foo says something bad) {
> 		/* Bad stuff: let's get a more accurate foo */
> 		foo = percpu_counter_sum(x);
> 	}
>
> 	if (foo still says something bad)
> 		do_bad_thing();
>
> In other words, don't do all this stuff with percpu_drift_mark and the
> kswapd heuristic.  Just change zone_watermark_ok() to use the more
> accurate read if it's about to return "no".

percpu counters must always be added up when their value is determined. We
cannot really affort that for the VM. Counters are always available
without looping over all cpus.

vm counters are continually kept up to date (but may have delta limited by
time and counter values).

This seems to be a special case here where Mel does not want to have to
cost to bring the counters up to date nor reduce the delta/time limits to
get some more accuracy but wants take some sort of snapshot of the whole
situation for this particular case.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>