On Fri, Apr 20, 2012 at 7:29 PM, Takuya Yoshikawa <takuya.yoshikawa@xxxxxxxxx> wrote:
> On Fri, 20 Apr 2012 19:15:24 -0700
> Mike Waychison <mikew@xxxxxxxxxx> wrote:
>
>> In our situation, we simply disable the shrinker altogether.  My
>> understanding is that with EPT or NPT, the amount of memory used by
>> these tables is bounded by the size of guest physical memory, whereas
>> with software-shadowed tables, it is bounded by the address spaces
>> in the guest.  This bound makes it reasonable to skip reclaim
>> entirely and charge the memory as a "system overhead tax".
>
> IIRC, KVM's mmu_shrink is mainly for protecting the host from a
> pathological guest without EPT or NPT.
>
> You can see Avi's summary: http://www.spinics.net/lists/kvm/msg65671.html
> ===
> We should aim for the following:
> - normal operation causes very few shrinks (some are okay)
> - high pressure mostly due to kvm results in kvm being shrunk (this is a
>   pathological case caused by starting a guest with a huge amount of
>   memory, mapping it all to /dev/zero (or ksm), and getting the guest
>   to create shadow mappings for all of it)
> - general high pressure is shared among other caches like dcache and icache
>
> The cost of reestablishing an mmu page can be as high as half a
> millisecond of cpu time, which is the reason I want to be conservative.

To add to that, on these systems (32-way), the fault itself isn't as
heavy-handed as a global lock in everyone's reclaim path :)

I'd be very happy if this stuff were memcg aware, but until that
happens, this code is disabled in our production builds.  Avoiding the
30% of CPU time lost to a spinlock when mixing VMs with IO is worth
the < 1% of system RAM these pages cost, since it buys us
tighter/more-deterministic service latencies.

> ===
>
> Thanks,
>         Takuya
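
For concreteness, "disable the shrinker altogether" can be done with a
change along these lines.  This is a minimal sketch against a 3.x-era
arch/x86/kvm/mmu.c: the function and struct names match that tree, but
the no-op body is an illustration, not the actual production patch.

    #include <linux/shrinker.h>

    static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
    {
            /*
             * Always report an empty cache so shrink_slab() never asks
             * KVM to drop mmu pages.  With EPT/NPT their footprint is
             * bounded by guest physical memory, so it can be treated as
             * fixed overhead rather than contending on mmu_lock in
             * everyone's reclaim path.
             */
            return 0;
    }

    static struct shrinker mmu_shrinker = {
            .shrink = mmu_shrink,
            .seeks  = DEFAULT_SEEKS * 10,
    };

An equivalent (and arguably cleaner) approach would be to drop the
register_shrinker(&mmu_shrinker) call from kvm_mmu_module_init(), so
the shrinker is never visible to the VM core at all.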