On Fri, Jun 26, 2015 at 03:14:39AM +0900, Joonsoo Kim wrote:
> > It could though. Reclaim/compaction is entered for orders higher than
> > PAGE_ALLOC_COSTLY_ORDER and when scan priority is sufficiently high.
> > That could be adjusted if you have a viable case where orders <
> > PAGE_ALLOC_COSTLY_ORDER must succeed and currently requires excessive
> > reclaim instead of relying on compaction.
>
> Yes. I saw this problem in real situation. In ARM, order-2 allocation
> is requested in fork(), so it should be succeed. But, there is not enough
> order-2 freepage, so reclaim/compaction begins. Compaction fails repeatedly
> although I didn't check exact reason.

That should be identified and repaired prior to reimplementing
compaction because it's important.

> >> >> 3) Compaction capability is highly depends on migratetype of memory,
> >> >> because freepage scanner doesn't scan unmovable pageblock.
> >> >>
> >> >
> >> > For a very good reason. Unmovable allocation requests that fallback to
> >> > other pageblocks are the worst in terms of fragmentation avoidance. The
> >> > more of these events there are, the more the system will decay. If there
> >> > are many of these events then a compaction benchmark may start with high
> >> > success rates but decay over time.
> >> >
> >> > Very broadly speaking, the more the mm_page_alloc_extfrag tracepoint
> >> > triggers with alloc_migratetype == MIGRATE_UNMOVABLE, the faster the
> >> > system is decaying. Having the freepage scanner select unmovable
> >> > pageblocks will trigger this event more frequently.
> >> >
> >> > The unfortunate impact is that selecting unmovable blocks from the free
> >> > scanner will improve compaction success rates for high-order kernel
> >> > allocations early in the lifetime of the system but later fail high-order
> >> > allocation requests as more pageblocks get converted to unmovable. It
> >> > might be ok for kernel allocations but THP will eventually have a 100%
> >> > failure rate.
> >>
> >> I wrote rationale in the patch itself. We already use non-movable pageblock
> >> for migration scanner. It empties non-movable pageblock so number of
> >> freepage on non-movable pageblock will increase. Using non-movable
> >> pageblock for freepage scanner negates this effect so number of freepage
> >> on non-movable pageblock will be balanced. Could you tell me in detail
> >> how freepage scanner select unmovable pageblocks will cause
> >> more fragmentation? Possibly, I don't understand effect of this patch
> >> correctly and need some investigation. :)
> >>
> >
> > The long-term success rate of fragmentation avoidance depends on
> > minimising the number of UNMOVABLE allocation requests that use a
> > pageblock belonging to another migratetype. Once such a fallback occurs,
> > that pageblock potentially can never be used for a THP allocation again.
> >
> > Let's say there is an unmovable pageblock with 500 free pages in it. If
> > the freepage scanner uses that pageblock and allocates all 500 free
> > pages then the next unmovable allocation request needs a new pageblock.
> > If one is not completely free then it will fallback to using a
> > RECLAIMABLE or MOVABLE pageblock forever contaminating it.
>
> Yes, I can imagine that situation. But, as I said above, we already use
> non-movable pageblock for migration scanner. While unmovable
> pageblock with 500 free pages fills, some other unmovable pageblock
> with some movable pages will be emptied. Number of freepage
> on non-movable would be maintained so fallback doesn't happen.
>
> Anyway, it is better to investigate this effect. I will do it and attach
> result on next submission.
>

Let's say we have X unmovable pageblocks and Y pageblocks overall. If the
migration scanner takes movable pages from X then there is more space for
unmovable allocations without having to increase X -- this is good. If
the free scanner uses the X pageblocks as targets then they can fill.
The next unmovable allocation then falls back to another pageblock and we
either have X+1 unmovable pageblocks (full steal) or a mixed pageblock
(partial steal) that cannot be used for THP. Do this enough times and
X == Y and all THP allocations fail.

-- 
Mel Gorman
SUSE Labs
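
The X/Y argument above can be illustrated with a toy model. This is only a
sketch and not kernel code: the pageblock size, the block counts, the
function name simulate() and the "net_fill" parameter (pages per round that
the free scanner places into unmovable pageblocks, minus whatever the
migration scanner moves out of them) are all assumptions made up for the
illustration. Every fallback is modelled as a full steal, and unmovable
requests are assumed to be short-lived so that only compaction changes the
amount of free space inside the X pageblocks.

```python
PAGES_PER_BLOCK = 512   # assumption: 2 MiB pageblock of 4 KiB pages
TOTAL_BLOCKS = 64       # Y: hypothetical number of pageblocks overall
START_UNMOVABLE = 4     # X at the start, hypothetical


def simulate(rounds, alloc, net_fill):
    """Return X (unmovable pageblocks) after `rounds` rounds of the toy model."""
    x = START_UNMOVABLE
    free_pages = x * PAGES_PER_BLOCK
    for _ in range(rounds):
        # Free scanner: migrated movable pages land in the X pageblocks.
        free_pages -= min(net_fill, free_pages)
        # A short-lived unmovable request needs `alloc` free pages right now.
        while free_pages < alloc and x < TOTAL_BLOCKS:
            x += 1                          # fallback, modelled as a full steal
            free_pages += PAGES_PER_BLOCK
        # The request is freed within the round, so free_pages is unchanged.
    return x


if __name__ == "__main__":
    print("free scanner avoids X:", simulate(1000, 64, 0))
    print("free scanner fills X: ", simulate(1000, 64, 256))
```

With net_fill == 0 (the migration scanner removes at least as much as the
free scanner adds) X stays at its starting value, which is the balanced case
described in the quoted reply. With any sustained positive net_fill X grows
until X == Y. Which of the two regimes a real workload falls into is exactly
what the proposed investigation would need to measure.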