On Tue, Sep 27, 2016 at 09:41:48PM -0400, Johannes Weiner wrote: > Hi guys, > > we noticed what looks like a regression in page mobility grouping > during an upgrade from 3.10 to 4.0. Identical machines, workloads, and > uptime, but /proc/pagetypeinfo on 3.10 looks like this: > > Number of blocks type Unmovable Reclaimable Movable Reserve Isolate > Node 1, zone Normal 815 433 31518 2 0 > > and on 4.0 like this: > > Number of blocks type Unmovable Reclaimable Movable Reserve CMA Isolate > Node 1, zone Normal 3880 3530 25356 2 0 0 > Unmovable pageblocks is not necessarily related to the number of unmovable pages in the system although it is obviously a concern. Basically there are two usual approaches to investigating this -- close attention to the extfrag tracepoint and analysing high-order allocation failures. It's drastic, but when migration grouping was first implemented it was necessary to use a variation of PAGE_OWNER to walk the movable pageblocks identifying unmovable allocations in there. I also used to have a debugging patch that would print out the owner of all pages that failed to migrate within an unmovable block. Unfortunately I don't have these patches any more and they wouldn't apply anyway but it'd be easier to implement today than it was 7-8 years ago. > 4.0 is either polluting pageblocks more aggressively at allocation, or > is not able to make pageblocks movable again when the reclaimable and > unmovable allocations are released. Invoking compaction manually > (/proc/sys/vm/compact_memory) is not bringing them back, either. > > The problem we are debugging is that these machines have a very high > rate of order-3 allocations (fdtable during fork, network rx), and > after the upgrade allocstalls have increased dramatically. I'm not > entirely sure this is the same issue, since even order-0 allocations > are struggling, but the mobility grouping in itself looks problematic. > Network RX is likely to be atomic allocations. Another potentially place to focus on is the use of HighAtomic pageblocks and either increasing them in size or protecting them more aggressively. > I'm still going through the changes relevant to mobility grouping in > that timeframe, but if this rings a bell for anyone, it would help. I > hate blaming random patches, but these caught my eye: > > 9c0415e mm: more aggressive page stealing for UNMOVABLE allocations > 3a1086f mm: always steal split buddies in fallback allocations > 99592d5 mm: when stealing freepages, also take pages created by splitting buddy page > > The changelog states that by aggressively stealing split buddy pages > during a fallback allocation we avoid subsequent stealing. But since > there are generally more movable/reclaimable pages available, and so > less falling back and stealing freepages on behalf of movable, won't > this mean that we could expect exactly that result - growing numbers > of unmovable blocks, while rarely stealing them back in movable alloc > fallbacks? And the expansion of !MOVABLE blocks would over time make > compaction less and less effective too, seeing as it doesn't consider > anything !MOVABLE suitable migration targets? > It's a solid theory. There has been a lot of activity to weaken fragmentation avoidance protection to reduce latency. Unfortunately external fragmentation continues to be one of those topics that is very difficult to precisely define because it's a matter of definition whether it's important or not. Another avenue worth considering is that compaction used to scan unmovable pageblocks and migrate movable pages out of there but that was weakened over time trying to allocate THP pages from direct allocation context quickly enough. I'm not exactly sure what we do there at the moment and whether kcompactd cleans unmovable pageblocks or not. It takes time but it also reduces unmovable pageblock steals over time (or at least it did a few years ago when I last investigated this in depth). Unfortunately I do not have any suggestions offhand on how it could be easily improved without going back to first principals and identifying what pages end up in awkward positions, why and whether the cost of "cleaning" unmovable pageblocks during compaction for a high-order allocation is justified or not. -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>