On Thu, Jul 21, 2011 at 12:36:11PM -0400, Andrew Lutomirski wrote:
> On Thu, Jul 21, 2011 at 12:24 PM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> > On Thu, Jul 21, 2011 at 05:09:59PM +0100, Mel Gorman wrote:
> >> On Fri, Jul 22, 2011 at 12:37:22AM +0900, Minchan Kim wrote:
> >> > On Fri, Jun 24, 2011 at 03:44:53PM +0100, Mel Gorman wrote:
> >> > > (Built this time and passed a basic sniff-test.)
> >> > >
> >> > > During allocator-intensive workloads, kswapd will be woken frequently
> >> > > causing free memory to oscillate between the high and min watermark.
> >> > > This is expected behaviour. Unfortunately, if the highest zone is
> >> > > small, a problem occurs.
> >> > >
> >> > > This seems to happen most with recent sandybridge laptops but it's
> >> > > probably a coincidence as some of these laptops just happen to have
> >> > > a small Normal zone. The reproduction case is almost always during
> >> > > copying large files that kswapd pegs at 100% CPU until the file is
> >> > > deleted or cache is dropped.
> >> > >
> >> > > The problem is mostly down to sleeping_prematurely() keeping kswapd
> >> > > awake when the highest zone is small and unreclaimable and compounded
> >> > > by the fact we shrink slabs even when not shrinking zones causing a lot
> >> > > of time to be spent in shrinkers and a lot of memory to be reclaimed.
> >> > >
> >> > > Patch 1 corrects sleeping_prematurely to check the zones matching
> >> > > the classzone_idx instead of all zones.
> >> > >
> >> > > Patch 2 avoids shrinking slab when we are not shrinking a zone.
> >> > >
> >> > > Patch 3 notes that sleeping_prematurely is checking lower zones against
> >> > > a high classzone which is not what allocators or balance_pgdat()
> >> > > is doing leading to an artificial belief that kswapd should be
> >> > > still awake.
> >> > >
> >> > > Patch 4 notes that when balance_pgdat() gives up on a high zone, the
> >> > > decision is not communicated to sleeping_prematurely()
> >> > >
> >> > > This problem affects 2.6.38.8 for certain and is expected to affect
> >> > > 2.6.39 and 3.0-rc4 as well. If accepted, they need to go to -stable
> >> > > to be picked up by distros and this series is against 3.0-rc4. I've
> >> > > cc'd people that reported similar problems recently to see if they
> >> > > still suffer from the problem and if this fixes it.
> >> > >
> >> >
> >> > Good!
> >> > This patch solved the problem.
> >> > But there is still a mystery.
> >> >
> >> > In the log, we could see excessive shrink_slab calls.
> >>
> >> Yes, because shrink_slab() was called on each loop through
> >> balance_pgdat() even if the zone was balanced.
> >>
> >>
> >> > And as you know, we merged a patch which adds cond_resched() at the
> >> > end of shrink_slab(). So other tasks should get the CPU and we should
> >> > not see 100% CPU of kswapd, I think.
> >> >
> >>
> >> cond_resched() is not a substitute for going to sleep.
> >
> > Of course, it's not equal to sleeping, but other tasks should get the CPU
> > and consume their time slices.
> > So we should never see 100% CPU consumption of kswapd.
> > No?
>
> If the rest of the system is idle, then kswapd will happily use 100%
> CPU. (Or on a multi-core system, kswapd will use close to 100% of one

Of course. But at least we have a test program running, so I think the
system is not idle.

> CPU even if another task is using the other one. This is bad enough
> on a desktop, but on a laptop you start to notice when your battery

Of course it's bad. :)
What I want to know is just the exact cause of the 100% CPU usage.
It might not be exactly 100%; we might be using the figure loosely.

> dies.)
>
> --Andy
>
> >
> >>
> >> --
> >> Mel Gorman
> >> SUSE Labs
> >
> > --
> > Kind regards,
> > Minchan Kim
>

--
Kind regards,
Minchan Kim