Re: [RFC v2 4/7] change kernel accounting to include steal time

On 09/02/2010 11:19 AM, Glauber Costa wrote:
> On Tue, Aug 31, 2010 at 10:11:49AM +0200, Peter Zijlstra wrote:
>> I think it's easier (and sufficient) for the host to tell the guest how
>> long it was _not_ running. That can simply be passed in when you start
>> the vcpu again and doesn't need a fancy communication channel.
>>
>> The guest's sched_clock() will measure wall time, the guest's
>> sched_clock_stolen() will report the accumulation of these stolen times.
>>
>> Then you can make sched_clock_unstolen() be sched_clock() -
>> sched_clock_stolen(). And like Jeremy said, if you make the sched_fair
>> stuff use sched_clock_unstolen() things should more or less work.
> So what's the big drawback of just making sched_clock return sched_clock_unstolen?
> When there is no steal time involved, they will just be equal anyway.
> And this way, everybody that relies on sched_clock for whatever reason,
> will probably work.

I can say from experience that it definitely does not work.  That's what
I tried up until recently when I reverted it all.

It doesn't work for two semi-related reasons:

    * Using sched_clock_unstolen() to measure idle/sleep times is
      completely meaningless (nobody cares if the cpu was "stolen" while
      you were sleeping).
    * The notion of unstolen time is inherently per-cpu, so the unstolen
      clocks are completely unsynchronized between cpus, whereas
      sched_clock is expected to be moderately well synchronized.

A corollary to this is that if a task sleeps on one cpu and wakes on
another, then "sleep_time = sleep_end - sleep_start" is completely,
utterly meaningless if you use per-cpu unstolen time as the timebase
for sleep_start/end.
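
To make that concrete, here is a rough userspace sketch, not kernel code.
It assumes unstolen time is simply wall time minus a per-cpu stolen
accumulator, as Peter describes above. The names stolen_ns,
account_steal() and sleep_time_unstolen() are made up for illustration,
and wall time is passed in explicitly so the arithmetic is easy to
follow.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 2

/* Hypothetical per-cpu accumulation of stolen time, in nanoseconds. */
uint64_t stolen_ns[NR_CPUS];

/* Record that the host ran something else for 'ns' on this vcpu. */
void account_steal(int cpu, uint64_t ns)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	stolen_ns[cpu] += ns;
}

/* Unstolen clock for one cpu: wall time minus that cpu's stolen time. */
uint64_t sched_clock_unstolen(int cpu, uint64_t wall_ns)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return wall_ns - stolen_ns[cpu];
}

/*
 * Sleep duration computed the naive way, with the start stamp taken on
 * sleep_cpu and the end stamp taken on wake_cpu. The subtraction is done
 * in u64 and reinterpreted as signed so a backwards step shows up as a
 * negative value.
 */
int64_t sleep_time_unstolen(int sleep_cpu, uint64_t start_wall_ns,
			    int wake_cpu, uint64_t end_wall_ns)
{
	uint64_t start = sched_clock_unstolen(sleep_cpu, start_wall_ns);
	uint64_t end = sched_clock_unstolen(wake_cpu, end_wall_ns);

	return (int64_t)(end - start);
}
```

With 100ns stolen from cpu 0 and 900ns from cpu 1, a task that sleeps on
cpu 0 at wall time 1000 and wakes on cpu 1 at wall time 1500 gets a
"sleep time" of -300ns, even though 500ns of wall time passed. If the
result is kept unsigned, the same delta reads as an enormous positive
value instead.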

(The surprising thing is that it actually works fairly well, in
that the system runs and there are only occasional oddities like "idle
time" being completely misreported, and sometimes processes going to
sleep for really long times.)

    J