On Mon, 5 May 2014, Vlastimil Babka wrote: > OK that's due to my commit 50b5b094e6 ("mm: compaction: do not mark unmovable > pageblocks as skipped in async compaction") and the intention was to avoid > marking pageblocks as to-be-skipped just because they were ignored by async > compaction, which would make the following sync compaction ignore them as > well. However it's true that update_pageblock_skip() also updates the cached > pfn's and not updating them is a sideeffect of this change. > It's not necessary just that commit, update_pageblock_skip() won't do anything if cc->finished_update_migrate is true which still happens before the commit. This issue was noticed on a kernel without your commit. > I didn't think that would be a problem as skipping whole pageblocks due to > being non-movable should be fast and without taking locks. But if your testing > shows that this is a problem, then OK. > Async compaction terminates early when lru_lock is contended or need_resched() and on zones that are so large for a 128GB machine, this happens often. A thp allocation returns immediately because of contended_compaction in the page allocator. When the next thp is allocated, async compaction starts from where the former iteration started because we don't do any caching of the pfns and nothing called sync compaction. It's simply unnecessary overhead that can be prevented on the next call and it leaves a potentially large part of the zone unscanned if we continuously fail because of contention. This patch fixes that. > > This patch adds a per-zone cached migration scanner pfn only for async > > compaction. It is updated everytime a pageblock has been scanned in its > > entirety and when no pages from it were successfully isolated. The cached > > migration scanner pfn for sync compaction is updated only when called for > > sync > > compaction. > > I think this might be an overkill and maybe just decoupling the cached pfn > update from the update_pageblock_skip() would be enough, i.e. restore > pre-50b5b094e6 behavior for the cached pfn (but not for the skip bits)? I > wonder if your new sync migration scanner would make any difference. > Presumably when async compaction finishes without success by having the > scanners meet, compact_finished() will reset the cached pfn's and the sync > compaction will not have a chance to use any previously cached value anyway? > When a zone has 32GB or 64GB to scan, as is in this case (and will become larger in the future), async compaction will always terminate early. It may never cause a migration destination page to even be allocated, the freeing scanner may never move and there's no guarantee they will ever meet if we never call sync compaction. The logic presented in this patch will avoid rescanning the non-movable pageblocks, for example, for async compaction until all other memory has been scanned. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>