On 09/15/2012 11:55 AM, Richard Davies wrote:
Hi Rik, Mel and Shaohua,

Thank you for your latest patches. I attach my latest perf report for a slow boot with all of these applied.

Mel asked for timings of the slow boots. It's very hard to give anything useful here! A normal boot would be a minute or so, and many are like that, but the slowest that I have seen (on 3.5.x) was several hours. Basically, I just test many times until I get one which is noticeably slower than normal and then run perf record on that one.

The latest perf report for a slow boot is below. For the fast boots, most of the time is in clear_page_c in do_huge_pmd_anonymous_page, but for this slow one there is a lot of lock contention above that.
How often do you run into slow boots, vs. fast ones?
# Overhead  Command          Shared Object         Symbol
# ........  ...............  ....................  ..............................................
#
    58.49%  qemu-kvm  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
            |
            --- _raw_spin_lock_irqsave
               |
               |--95.07%-- compact_checklock_irqsave
               |          |
               |          |--70.03%-- isolate_migratepages_range
               |          |          compact_zone
               |          |          compact_zone_order
               |          |          try_to_compact_pages
               |          |          __alloc_pages_direct_compact
               |          |          __alloc_pages_nodemask
Looks like it moved from isolate_freepages_block in your last trace, to isolate_migratepages_range?

Mel, I wonder if we have any quadratic complexity problems in this part of the code, too?

The isolate_freepages_block CPU use can be fixed by simply restarting where the last invocation left off, instead of always starting at the end of the zone.

Could we need something similar for isolate_migratepages_range? After all, Richard has a 128GB system, and runs 108GB worth of KVM guests on it...

--
All rights reversed