On Tue, 3 Jan 2012, Andrew Morton wrote:
> On Sat, 31 Dec 2011 23:18:15 -0800 (PST)
> Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> > On Thu, 29 Dec 2011, Andrew Morton wrote:
> > >
> > > This is not all some handwavy theoretical thing either. If we've gone
> > > and introduced serious latency issues, people *will* hit them and treat
> > > it as a regression.
> >
> > Sure, though the worst I've seen so far (probably haven't been trying
> > hard enough yet, I need to go for THPs) is 39 pages freed in one call.
>
> 39 is OK. How hugepage-intensive was the workload?

Not very hugepagey at all. I've since tried harder, and the most I've
seen is 523 - I expect you to be more disagreeable about that number!
And we should be able to see twice that on i386 without PAE, though I
don't suppose there's a vital market for THP in that direction.

> > Regression? Well, any bad latency would already have been there on
> > the gathering side.

I did check whether similar numbers were coming out of isolate_lru_pages
(it could have been that only a hugepage was gathered, but then split
into many by the threat of swapping); yes, similar numbers at that end.

So using page_list in putback_lru/inactive_pages would not be increasing
the worst latency, just doubling its frequency. (Assuming that isolating
and putting back have the same cost: my guess is roughly the same, but
I've not measured.)

> > > Now, a way out here is to remove lumpy reclaim (please). And make the
> > > problem not come back by promising to never call putback_lru_pages(lots
> > > of pages) (how do we do this?).
> >
> > We can very easily put a counter in it, doing a spin_unlock_irq every
> > time we hit the max. Nothing prevents that, it's just an excrescence
> > I'd have preferred to omit and have not today implemented.
>
> Yes. It's ultra-cautious, but perhaps we should do this at least until
> lumpy goes away.

I don't think you'll accept my observations above as excuse to do
nothing, but please clarify which you think is more cautious. Should I
or should I not break up the isolating end in the same way as the
putting back?

I imagine breaking in every SWAP_CLUSTER_MAX 32, so the common order 0
isn't slowed at all; hmm, maybe add on (1 << PAGE_ALLOC_COSTLY_ORDER) 8
so Kosaki-san's point is respected at least for the uncostly orders.

Hugh
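
P.S. For concreteness, this is the kind of counter I mean on the
putback side - an untested sketch written from memory against 3.2-era
mm/vmscan.c, omitting the unevictable and accounting details of the
real putback_lru_pages, with a made-up function name, and with the
batch limit chosen as above:

/*
 * Sketch only: put the isolated pages on page_list back on their LRU
 * lists, but never hold zone->lru_lock (irqs disabled) across more
 * than "batch" pages at a time.
 *
 * batch is SWAP_CLUSTER_MAX, so the common order-0 case never hits
 * the break, plus 1 << PAGE_ALLOC_COSTLY_ORDER so the uncostly
 * higher orders don't hit it either.
 */
static void putback_lru_pages_batched(struct zone *zone,
				      struct list_head *page_list)
{
	const unsigned int batch = SWAP_CLUSTER_MAX +
				   (1 << PAGE_ALLOC_COSTLY_ORDER);
	unsigned int nr_putback = 0;

	spin_lock_irq(&zone->lru_lock);
	while (!list_empty(page_list)) {
		struct page *page = lru_to_page(page_list);
		enum lru_list lru;

		list_del(&page->lru);
		SetPageLRU(page);
		lru = page_lru(page);
		add_page_to_lru_list(zone, page, lru);

		/*
		 * The excrescence: every "batch" pages, briefly drop
		 * the lock and re-enable irqs, so a huge page_list can
		 * no longer produce an arbitrarily long irqs-off spell.
		 */
		if (++nr_putback >= batch && !list_empty(page_list)) {
			nr_putback = 0;
			spin_unlock_irq(&zone->lru_lock);
			spin_lock_irq(&zone->lru_lock);
		}
	}
	spin_unlock_irq(&zone->lru_lock);
}

The same counter could just as easily go into the isolate_lru_pages
loop, if you think the gathering end should be broken up in the same
way.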