On 6/5/19 12:58 AM, Vlastimil Babka wrote:
> On 6/5/19 1:30 AM, Mike Kravetz wrote:
>> While looking at some really long hugetlb page allocation times, I noticed
>> instances where should_compact_retry() was returning true more often than
>> I expected.  In one allocation attempt, it returned true 765668 times in a
>> row.  To me, this was unexpected because of the following:
>>
>> #define MAX_COMPACT_RETRIES 16
>> int max_retries = MAX_COMPACT_RETRIES;
>>
>> However, if should_compact_retry() returns true via the following path we
>> do not increase the retry count.
>>
>> 	/*
>> 	 * make sure the compaction wasn't deferred or didn't bail out early
>> 	 * due to locks contention before we declare that we should give up.
>> 	 * But do not retry if the given zonelist is not suitable for
>> 	 * compaction.
>> 	 */
>> 	if (compaction_withdrawn(compact_result)) {
>> 		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
>> 		goto out;
>> 	}
>>
>> Just curious, is this intentional?
>
> Hmm I guess we didn't expect compaction_withdrawn() to be so
> consistently returned. Do you know what value of compact_result is there
> in your test?

Added some instrumentation to record values and ran test,

557904 Total

549186 COMPACT_DEFERRED
  8718 COMPACT_PARTIAL_SKIPPED

Do note that this is not my biggest problem with these allocations.  That is
should_continue_reclaim() returning true more often than it should.  Still
trying to get more info on that.  This was just something curious I also
discovered.
-- 
Mike Kravetz
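
For anyone following along, here is a rough userspace sketch of why the
retry count never limits this path. It is not the kernel code: the
model_* names, the enum, and the helper are all made up for illustration,
and the only thing taken from the thread is that the withdrawn path
returns before the counter is touched while MAX_COMPACT_RETRIES is 16.
The sketch also assumes the counter is bumped on every non-withdrawn
result, which is a simplification.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compact_result values seen in the test. */
enum model_result {
	MODEL_DEFERRED,		/* stands in for COMPACT_DEFERRED */
	MODEL_PARTIAL_SKIPPED,	/* stands in for COMPACT_PARTIAL_SKIPPED */
	MODEL_SUCCESS,		/* any result that counts as progress */
};

#define MODEL_MAX_RETRIES 16	/* mirrors MAX_COMPACT_RETRIES */

/* Model of compaction_withdrawn(): true for the two results recorded above. */
static bool model_withdrawn(enum model_result r)
{
	return r == MODEL_DEFERRED || r == MODEL_PARTIAL_SKIPPED;
}

/*
 * Model of the retry decision. The withdrawn path returns early and
 * leaves *retries untouched, so the retry limit can never end the loop.
 */
static bool model_should_retry(enum model_result r, bool zonelist_suitable,
			       int *retries)
{
	if (model_withdrawn(r))
		return zonelist_suitable;

	(*retries)++;
	return *retries <= MODEL_MAX_RETRIES;
}

/*
 * Feed the same result repeatedly, up to 'attempts' times, and report
 * how many times in a row the model said "retry".
 */
static long model_count_retries(enum model_result r, bool zonelist_suitable,
				long attempts, int *retries)
{
	long n = 0;

	*retries = 0;
	while (n < attempts && model_should_retry(r, zonelist_suitable, retries))
		n++;
	return n;
}
```

With MODEL_DEFERRED and a suitable zonelist, model_count_retries() runs
for as many attempts as it is given and leaves the counter at zero. With
MODEL_SUCCESS it stops after 16 retries. In this model, the only thing
that ends the withdrawn case is the zonelist check returning false.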