On Thu, Jul 14, 2016 at 02:23:32PM +0900, Joonsoo Kim wrote:
> > > > > And, I'd like to know why max() is used for classzone_idx rather than
> > > > > min()? I think that kswapd should balance the lowest zone requested.
> > > > >
> > > >
> > > > If there are two allocation requests -- one zone-constrained and the other
> > > > zone-unconstrained, it does not make sense to have kswapd skip the pages
> > > > usable for the zone-unconstrained and waste a load of CPU. You could
> > > >
> > > I agree that, in this case, it's not good to skip the pages usable
> > > for the zone-unconstrained request. But, what I am concerned is that
> > > kswapd stop reclaim prematurely in the view of zone-constrained
> > > requestor.
> > >
> > It doesn't stop reclaiming for the lower zones. It's reclaiming the LRU
> > for the whole node that may or may not have lower zone pages at the end
> > of the LRU. If it does, then the allocation request will be satisfied.
> > If it does not, then kswapd will think the node is balanced and get
> > rewoken to do a zone-constrained reclaim pass.
> >
> If zone-constrained request could go direct reclaim pass, there would
> be no problem. But, please assume that request is zone-constrained
> without __GFP_DIRECT_RECLAIM which is common for some device driver
> implementation.

Then it's likely GFP_ATOMIC and it'll wake kswapd on each failure. If
kswapd is constantly awake for highmem requests then we're reclaiming
everything anyway. Remember that if kswapd is reclaiming for higher zones,
it'll still cover the lower zones eventually. There is no guarantee that
skipping the highmem pages will satisfy the atomic allocations any faster,
but consuming CPU to skip the pages is a definite cost. Even worse,
skipping highmem pages when highmem pages are required may make lowmem
pressure worse because the lowmem pages are freed faster and can then be
consumed by zone-unconstrained requests.
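
To make the max() behaviour concrete, here is a rough userspace sketch.
It is not the kernel code: struct node_state, sketch_wakeup_kswapd() and
sketch_page_eligible() are hypothetical, simplified stand-ins, and the
three-zone enum only assumes a typical layout with highmem. It shows that
once a higher classzone has been requested, lower-zone pages stay
eligible and a later lower-zone wakeup does not narrow the scan.

```c
#include <assert.h>

/* Simplified zone ordering, lowest first (assumed layout). */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical, stripped-down per-node state; not the real pg_data_t. */
struct node_state {
	int kswapd_awake;
	enum zone_type kswapd_classzone_idx;
};

/* Record a wakeup request, keeping the highest classzone seen so far. */
static void sketch_wakeup_kswapd(struct node_state *node,
				 enum zone_type classzone_idx)
{
	assert(classzone_idx < MAX_NR_ZONES);

	if (!node->kswapd_awake) {
		/* First request since kswapd slept: take it as-is. */
		node->kswapd_awake = 1;
		node->kswapd_classzone_idx = classzone_idx;
		return;
	}

	/* max(): a later, higher request widens the set of usable pages. */
	if (classzone_idx > node->kswapd_classzone_idx)
		node->kswapd_classzone_idx = classzone_idx;
}

/* A page is worth reclaiming if its zone is at or below the classzone. */
static int sketch_page_eligible(const struct node_state *node,
				enum zone_type page_zone)
{
	return page_zone <= node->kswapd_classzone_idx;
}
```

With min() instead of max(), a single lowmem wakeup would make every
highmem page on the LRU ineligible, which is the skipping cost described
above.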

If this really is a problem in practice then we can consider having
allocation requests that are zone-constrained and !__GFP_DIRECT_RECLAIM
set a flag and use the min classzone for the wakeup. That flag remains
set until kswapd takes at least one pass using the lower classzone and
clears it. The classzone will not be adjusted higher until that flag is
cleared. I don't think we should do it without evidence that it's a real
problem, because of the potential for kswapd to burn CPU uselessly and
for higher lowmem pressure.

-- 
Mel Gorman
SUSE Labs