On Thu, 28 Oct 2010, Andrew Morton wrote:

> > To ensure that kswapd wakes up, a safe version of zone_watermark_ok()
> > is introduced that takes a more accurate reading of NR_FREE_PAGES when
> > called from wakeup_kswapd, when deciding whether it is really safe to go
> > back to sleep in sleeping_prematurely() and when deciding if a zone is
> > really balanced or not in balance_pgdat(). We are still using an expensive
> > function but limiting how often it is called.
>
> Here I go again. I have a feeling that I already said this, but I
> can't find versions 2 or 3 in the archives..
>
> Did you evaluate using plain old percpu_counters for this? They won't
> solve the performance problem as they're basically the same thing as
> these open-coded counters. But they'd reduce the amount of noise and
> custom-coded boilerplate in mm/.

The zone counters are implemented with the ZVCs in vmstat.c to save space
and to sit in the same cacheline as other hot data needed for allocation
and free.

> > +	threshold = max(1, (int)(watermark_distance / num_online_cpus()));
> > +
> > +	/*
> > +	 * Maximum threshold is 125
>
> Reasoning?

Differentials are stored in 8 bit signed ints.

> > +	put_online_cpus();
> > +}
>
> Given that ->stat_threshold is the same for each CPU, why store it for
> each CPU at all? Why not put it in the zone and eliminate the inner
> loop?

Doing that caused cache misses in the past and reduced the performance of
the ZVCs. This way the threshold is in the same cacheline as the
differentials.
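
To make the two points above concrete (the 125 cap and keeping the threshold next to the differential), here is a rough userspace sketch. It is not the kernel code: calc_drift_threshold(), struct pcp_stat and mod_counter() are names made up for illustration, and the layout is only assumed to resemble what vmstat.c does.

```c
#include <stdint.h>

/* Cap chosen so that threshold + 1 still fits in a signed 8 bit value. */
#define MAX_STAT_THRESHOLD 125

/*
 * Hypothetical helper: spread the distance between watermarks over the
 * online CPUs and clamp the result into [1, MAX_STAT_THRESHOLD].
 */
static int calc_drift_threshold(long watermark_distance, int ncpus)
{
	long t = watermark_distance / ncpus;

	if (t < 1)
		t = 1;
	if (t > MAX_STAT_THRESHOLD)
		t = MAX_STAT_THRESHOLD;
	return (int)t;
}

/*
 * Hypothetical per-CPU slot. The threshold is stored right next to the
 * differential so that a counter update touches a single cacheline.
 */
struct pcp_stat {
	int8_t stat_threshold;
	int8_t diff;
};

/*
 * Accumulate a small delta (assumed to be +1 or -1) in the per-CPU
 * differential and fold it into the global counter only once it
 * drifts past the threshold.
 */
static void mod_counter(long *global, struct pcp_stat *pcp, int delta)
{
	int x = pcp->diff + delta;

	if (x > pcp->stat_threshold || x < -pcp->stat_threshold) {
		*global += x;
		x = 0;
	}
	pcp->diff = (int8_t)x;
}
```

With a threshold of 2, three increments in a row leave the differential at 1, then 2, and the third one folds all 3 into the global counter and resets the differential to 0. Up to threshold * ncpus pages can therefore be hidden in the differentials at any time, which is why the threshold is derived from the watermark distance.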