On Thu 10-02-11 10:21:10, Mel Gorman wrote:
> On Wed, Feb 09, 2011 at 07:28:46PM +0100, Andrea Arcangeli wrote:
> > On Wed, Feb 09, 2011 at 04:46:56PM +0000, Mel Gorman wrote:
> > > On Wed, Feb 09, 2011 at 04:46:06PM +0100, Johannes Weiner wrote:
> > > > Hi,
> > > >
> > > > I think this should fix the problem of processes getting stuck in
> > > > reclaim that has been reported several times.
> > >
> > > I don't think it's the only source but I'm basing this on seeing
> > > constant looping in balance_pgdat() and calling congestion_wait() a few
> > > weeks ago that I haven't rechecked since. However, this looks like a
> > > real fix for a real problem.
> >
> > Agreed. Just yesterday I spent some time on the lumpy compaction
> > changes after wondering about Michal's khugepaged 100% report, and I
> > expected some fix was needed in this area (as I couldn't find any bug
> > in khugepaged yet, so the lumpy compaction looked the next candidate
> > for bugs).
> >
>
> Michal did report that disabling defrag did not help but the stack trace
> also showed that it was stuck in shrink_zone() which is what Johannes'
> patch targets. It's not unreasonable to test if Johannes' patch solves
> Michal's problem. Michal, I know that your workload is a bit random and
> may not be reproducible but do you think it'd be possible to determine
> if Johannes' patch helps?

Sure, I can test it. Nevertheless, I haven't seen the problem again. I
have tried to make some memory pressure on the machine but no "luck".
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic