On Fri, Aug 19, 2016 at 03:23:20PM +0200, Vlastimil Babka wrote:
> What's that? Never heard of this before, but sounds scary :) I thought
> that zone_reclaim itself was rather discouraged nowadays, not a big
> candidate for further improvement.

It's a fix that I tried to push upstream but that wasn't merged. I kept
maintaining it because I got customer bug reports about THP causing
regressions to node_reclaim. Hard NUMA bindings would solve that, but
apparently there are apps that prefer no memory binding to allow
flexible spillover: they use CPU bindings only, but with a strong NUMA
bias provided by node_reclaim, which shrinks the cache (and only the
cache).

In any case it was a regression caused by THP, because compaction
wasn't invoked. Note that zone_reclaim has a synchronous, more
aggressive option that blocks for writeback if needed, so invoking
direct compaction there is surely ok if it's asked for on demand.

As usual it's a tradeoff between long-lived and short-lived
allocations, so if you reserve a system for computations and you know
your allocations are very long lived, it makes perfect sense to be
aggressive if you tune for it. zone_reclaim and synchronous direct
compaction are obviously bad defaults for general purpose settings, but
that doesn't mean it should be impossible to tune a system to run
optimally for a certain workload.

> Hm I'm not so sure. Are all movable allocations highmem? For example
> Joonsoo mentions in his ZONE_CMA patchset "blockdev file cache page
> [...] usually has __GFP_MOVABLE but not __GFP_HIGHMEM and __GFP_USER".
> Now we also have Minchan's infrastructure for arbitrary driver
> compaction, so those will be movable, but potentially still restricted
> to e.g. DMA32...

One option is to forbid such corner cases... and VM_WARN_ON (not a typo
:) available in my tree) if __GFP_MOVABLE is passed on lower
classzones.

The other option would be to have per-classzone lowpfn, highpfn scan
pointers.
That has some cons, but hey, this whole thing is a tradeoff, isn't it?
We're optimizing for the case where lowmem allocations are less
frequent, so we can just as well provide a worse compaction for lowmem
(by reducing the MOVABLE memory restricted to lower classzones, as
mentioned above) and leverage the node model to get a more powerful
compaction that crosses all zone boundaries when GFP_HIGHUSER is used.

I don't see why the tradeoff is valid when it comes to the LRU but not
valid when it comes to compaction. As it is, I have to do a blind loop
of (for-each-zone-in-the-node-in-reverse { compact_zone_order(zone) }),
which works worse than before and works worse than a zone-boundary-less
compaction based on the node model.
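
To make the shape of that blind loop concrete, here is a rough
user-space sketch in C. It is not kernel code and not the code in my
tree: struct toy_zone, toy_compact_zone_order() and
toy_compact_node_reverse() are made-up names, and "compaction succeeds
if the zone has at least 2^order free pages" is a simplifying
assumption standing in for the real compact_zone_order().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone: only what this sketch needs. */
struct toy_zone {
	const char *name;
	unsigned long free_pages;	/* free pages currently in the zone */
};

/*
 * Stand-in for compact_zone_order(). Assumption for illustration only:
 * compaction "succeeds" when the zone holds at least 2^order free
 * pages. Real compaction also depends on which pages are movable.
 */
static bool toy_compact_zone_order(const struct toy_zone *zone,
				   unsigned int order)
{
	return zone->free_pages >= (1UL << order);
}

/*
 * The blind loop: walk the node's zones from the highest allowed zone
 * down to the lowest and compact each one independently.
 * Returns the index of the first zone that succeeded, or -1 if none
 * did. *attempts reports how many zones were tried.
 */
static int toy_compact_node_reverse(const struct toy_zone *zones,
				    int highest_idx, unsigned int order,
				    unsigned int *attempts)
{
	assert(zones != NULL && attempts != NULL);

	*attempts = 0;
	for (int i = highest_idx; i >= 0; i--) {
		(*attempts)++;
		if (toy_compact_zone_order(&zones[i], order))
			return i;
	}
	return -1;
}
```

Each zone is tried in isolation, so in this model nothing learned or
freed in one zone helps the next iteration. That is the part a
compaction based on the node model would not be limited by.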