Re: [RFC v2 0/7] kvm steal time implementation

 On 08/30/2010 09:06 AM, Glauber Costa wrote:
> Hi,
>
> So, this is basically the same as v1, with three major
> differences: 
>  1) I am posting to lkml for wider audience
>  2) patch 2/7 fixes one problem I mentioned would happen in
>     smp systems, which is, we only update kvmclock when we
>     change pcpu
>  3) softlockup algorithm is changed. Again, as marcelo pointed
>     out, this is open to discussion, and I am not dropping it
>     so more people can step in.
>
> I have some other patches under local test for a slightly modified
> guest part accounting, and I do somehow support extending
> the interface, and changing to nsecs (maybe not 100 %, but...). But
> I am posting in this state so we can have lkml people to step
> in earlier.
>
> Reminder of the previous cover-letter:
>
> There are two parts of it: the guest and host part.
>
> The proposal for the guest part, is to just change the
> common time accounting, and try to identify at that spot,
> whether or not we should account any steal time. I considered
> this idea less cumbersome than trying to cook a clockevents
> implementation ourselves, since I see little value in it.
> I am, however, pretty open to suggestions.

What's the relationship between clockevents and stolen time?  Are you
thinking some form of timer that counts unstolen time or something?

> For the host<->guest communications, I am using a shared
> page, in the same way as pvclock. Because of that, I am just
> hijacking pvclock structure anyway. There is a 32-bit field
> floating by, that gives us enough room for 8 years of steal
> time (we use msec resolution).
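
As a rough check on the range of such a field, the sketch below computes how long an unsigned 32-bit counter lasts at a given resolution. The helper names wrap_seconds() and wrap_days() are made up for illustration and are not part of pvclock. By this arithmetic a millisecond counter wraps after about 49.7 days, so a span of years would need a coarser unit or a wider field.

```c
#include <stdint.h>

/* Hypothetical helper: seconds until an unsigned 32-bit counter wraps,
 * given how many counter units make up one second. */
static uint64_t wrap_seconds(uint64_t units_per_sec)
{
	return ((uint64_t)UINT32_MAX + 1) / units_per_sec;
}

/* Same span expressed in whole days (rounded down). */
static uint64_t wrap_days(uint64_t units_per_sec)
{
	return wrap_seconds(units_per_sec) / 86400;
}
```

Calling wrap_days(1000) covers the msec case, and wrap_days(1) shows what a one-second unit would give instead.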

Please don't.  The pvclock structure is already getting a bit packed
with stuff, and stolen time isn't really part of its job.  In Xen we
have a separate runstate structure which allows the guest to get a
detailed breakdown of the time each vcpu spends in each state (which are
guaranteed to sum to the system time).  We can use that to compute how
much time has been stolen (time spent in RUNNABLE state).  You might
consider a similar ABI for KVM, even if you can't (yet) fill out all the
time values.
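
To make the suggestion concrete, here is a minimal sketch of what a runstate-style record could look like. All names (struct vcpu_runstate, runstate_change(), runstate_stolen_ns()) are hypothetical and only loosely modelled on the Xen idea; this is not the Xen ABI and not a proposed KVM one. A real shared-page layout would also need some way for the guest to read a consistent snapshot, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu states, modelled loosely on Xen's runstates. */
enum vcpu_state {
	VCPU_RUNNING,   /* executing on a physical cpu */
	VCPU_RUNNABLE,  /* wants to run but is waiting for a cpu */
	VCPU_BLOCKED,   /* idle, waiting for an event */
	VCPU_OFFLINE,   /* not available to run */
	VCPU_NR_STATES
};

/* Hypothetical shared record the host would update for each vcpu. */
struct vcpu_runstate {
	uint32_t state;                    /* current enum vcpu_state */
	uint64_t state_entry_ns;           /* when the current state began */
	uint64_t time_ns[VCPU_NR_STATES];  /* cumulative time per state */
};

/* Host side: close out the current state at now_ns and enter next. */
static void runstate_change(struct vcpu_runstate *rs, uint32_t next,
			    uint64_t now_ns)
{
	assert(next < VCPU_NR_STATES && now_ns >= rs->state_entry_ns);
	rs->time_ns[rs->state] += now_ns - rs->state_entry_ns;
	rs->state = next;
	rs->state_entry_ns = now_ns;
}

/* Guest side: stolen time is the time spent runnable but not running. */
static uint64_t runstate_stolen_ns(const struct vcpu_runstate *rs)
{
	return rs->time_ns[VCPU_RUNNABLE];
}
```

Because every transition charges the elapsed interval to exactly one state, the per-state totals sum to the time of the last transition, which is the property that makes the breakdown easy to trust. States the host cannot fill in yet would simply stay at zero.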


> The main idea is to timestamp our exit and entry through
> sched notifiers, and export the value at pvclock updates.
> This obviously has some disadvantages: by doing this we
> are giving up future ideas about only updating
> this structure once, and even right now, won't work
> on pinned-smp (since we don't update pvclock if we
> haven't changed cpus.)
>
> But again, it is just an RFC, and I'd like to feel the
> reception of the idea as a whole.

The Xen code has always accounted for stolen time, so it appears in top,
vmstat, etc, and gives a user/admin some idea about how much their
domain is suffering from competition.  This doesn't require any kernel
changes aside from some calls to account_steal_ticks(); we do this every
timer interrupt, accumulating whole ticks as we get them.
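
Roughly, the per-interrupt accumulation looks like the sketch below. It is a userspace illustration rather than the actual Xen code: struct steal_acct and steal_acct_tick() are made-up names, HZ is assumed to be 100, and the ticks_reported field stands in for the point where the kernel would call account_steal_ticks().

```c
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define HZ            100ULL                 /* assumed tick rate */
#define NSEC_PER_TICK (NSEC_PER_SEC / HZ)

/* Hypothetical per-cpu bookkeeping kept between timer interrupts. */
struct steal_acct {
	uint64_t last_stolen_ns;  /* cumulative stolen time seen last time */
	uint64_t residual_ns;     /* stolen time not yet reported as a tick */
	uint64_t ticks_reported;  /* stand-in for account_steal_ticks() calls */
};

/* Called from the timer interrupt with the cumulative stolen time.
 * Returns the number of whole ticks accounted on this call. */
static uint64_t steal_acct_tick(struct steal_acct *a, uint64_t stolen_ns)
{
	uint64_t pending = a->residual_ns + (stolen_ns - a->last_stolen_ns);
	uint64_t ticks = pending / NSEC_PER_TICK;

	a->last_stolen_ns = stolen_ns;
	a->residual_ns = pending % NSEC_PER_TICK;  /* carry the remainder */
	a->ticks_reported += ticks;  /* kernel: account_steal_ticks(ticks) */
	return ticks;
}
```

Carrying the remainder forward means sub-tick amounts of stolen time are not lost; they just show up one interrupt later.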

But I've not successfully managed to make the scheduler work well with
stolen time.  I did experiment with making sched_clock return unstolen
time, on the grounds that it would give the scheduler more information
about how long things actually executed for.  But it's meaningless for
measuring sleep/idle times, and it causes the different cpus' timebases
to drift severely, which causes other things in the kernel to get upset.
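
The drift is easy to see with a toy model. The function names below are hypothetical and this is not how sched_clock is implemented; it only assumes each cpu's clock is the shared system time minus that cpu's own stolen time.

```c
#include <stdint.h>

/* Hypothetical per-cpu clock that advances only while the vcpu really runs. */
static uint64_t unstolen_clock_ns(uint64_t system_ns, uint64_t stolen_ns)
{
	return system_ns - stolen_ns;
}

/* Difference between two cpus' unstolen clocks at the same system time. */
static int64_t unstolen_skew_ns(uint64_t system_ns, uint64_t stolen_a_ns,
				uint64_t stolen_b_ns)
{
	int64_t a = (int64_t)unstolen_clock_ns(system_ns, stolen_a_ns);
	int64_t b = (int64_t)unstolen_clock_ns(system_ns, stolen_b_ns);

	return a - b;
}
```

The skew is just the difference in stolen time between the two cpus, so it grows without bound whenever one vcpu is consistently preempted more than another.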

So I think any work in the scheduler area is interesting, and probably
worth posting separately unless it's too entangled with the stolen time
accounting area.

    J

> Have a nice review.
> Glauber Costa (7):
>   change headers preparing for steal time
>   always call kvm_write_guest
>   measure time out of guest
>   change kernel accounting to include steal time
>   kvm steal time implementation
>   touch softlockup watchdog
>   tell guest about steal time feature
>
>  arch/x86/include/asm/kvm_host.h    |    2 +
>  arch/x86/include/asm/kvm_para.h    |    1 +
>  arch/x86/include/asm/pvclock-abi.h |    4 ++-
>  arch/x86/kernel/kvmclock.c         |   40 ++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c                 |   26 ++++++++++++++++++----
>  include/linux/sched.h              |    1 +
>  kernel/sched.c                     |   29 ++++++++++++++++++++++++++
>  7 files changed, 97 insertions(+), 6 deletions(-)
>
