Re: [RFC v2 4/7] change kernel accounting to include steal time

Glauber Costa <glommer@xxxxxxxxxx> · Thu, 2 Sep 2010 15:19:56 -0300



On Tue, Aug 31, 2010 at 10:11:49AM +0200, Peter Zijlstra wrote:
> On Mon, 2010-08-30 at 19:03 -0400, Rik van Riel wrote:
> > 
> > > I think it basically comes down to adding "sched_clock_unstolen()" which
> > > the scheduler can use to measure time a process spends running, and
> > > sched_clock() for measuring sleep times.  In the normal case,
> > > sched_clock_unstolen() would be the same as sched_clock().
> > 
> > That requires the host to export (any time the guest is scheduled
> > in), the amount of CPU time the VCPU thread has used, and the time
> > the VCPU was scheduled in.
> > 
> > Since the VCPU must be running when it is examining these variables,
> > it can calculate the additional time (since it was last scheduled)
> > to account to the task, and remember the currently calculated time
> > in its own per-vcpu variable, so next time it can get a delta again. 
> 
> I think its easier (and sufficient) for the host to tell the guest how
> long it was _not_ running. That can simply be passed in when you start
> the vcpu again and doesn't need a fancy communication channel.
> 
> The guests sched_clock() will measure wall time, the guests
> sched_clock_stolen() will report the accumulation of these stolen times.
> 
> Then you can make sched_clock_unstolen() be sched_clock() -
> sched_clock_stolen(). And like Jeremy said, if you make the sched_fair
> stuff use sched_clock_unstolen() things should more or less work.

So what's the big drawback of just making sched_clock() return
sched_clock_unstolen()? When there is no steal time involved, the two
will be equal anyway. And this way, everybody who relies on
sched_clock(), for whatever reason, will probably just work.
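
To make the arithmetic concrete, here is a minimal user-space sketch of
the relation Peter describes above, sched_clock_unstolen() =
sched_clock() - sched_clock_stolen(). It is only a model: struct
vcpu_clock, vcpu_advance() and the model_* helpers are hypothetical
names made up for this example, not existing kernel interfaces, and it
assumes the host hands the guest the stolen time as a plain nanosecond
count each time the vcpu is scheduled back in.

```c
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping; not a real kernel structure. */
struct vcpu_clock {
	uint64_t wall_ns;   /* what sched_clock() would report (wall time) */
	uint64_t stolen_ns; /* accumulated time the host says we did not run */
};

/*
 * Model one scheduling interval: the vcpu ran for ran_ns and was
 * descheduled by the host for stolen_ns. Wall time advances by both,
 * the stolen counter only by the stolen part.
 */
static void vcpu_advance(struct vcpu_clock *c, uint64_t ran_ns,
			 uint64_t stolen_ns)
{
	c->wall_ns += ran_ns + stolen_ns;
	c->stolen_ns += stolen_ns;
}

static uint64_t model_sched_clock(const struct vcpu_clock *c)
{
	return c->wall_ns;
}

static uint64_t model_sched_clock_stolen(const struct vcpu_clock *c)
{
	return c->stolen_ns;
}

/* Time the vcpu actually spent running: wall time minus stolen time. */
static uint64_t model_sched_clock_unstolen(const struct vcpu_clock *c)
{
	return c->wall_ns - c->stolen_ns;
}
```

In this model, as long as stolen_ns stays at zero the unstolen clock and
the plain clock return the same value, which is the case the question
above is about.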
