On Thu, Aug 26, 2010 at 05:28:56PM -0300, Glauber Costa wrote:
> On Thu, Aug 26, 2010 at 02:23:03PM -0300, Marcelo Tosatti wrote:
> > On Wed, Aug 25, 2010 at 05:43:14PM -0400, Glauber Costa wrote:
> > > This patch proposes a common steal time implementation. When no
> > > steal time is accounted, we just add a branch to the current
> > > accounting code, that shouldn't add much overhead.
> > >
> > > When we do want to register steal time, we proceed as following:
> > > - if we would account user or system time in this tick, and there is
> > >   out-of-cpu time registered, we skip it altogether, and account steal
> > >   time only.
> > > - if we would account user or system time in this tick, and we got the
> > >   cpu for the whole slice, we proceed normally.
> > > - if we are idle in this tick, we flush out-of-cpu time to give it the
> > >   chance to update whatever last-measure internal variable it may have.
> >
> > Problem of using sched notifiers is that you don't differentiate whether
> > the vcpu scheduled out by its own (via hlt emulation) or not.
> And we don't need to. If we're out because we want to, we're idle.
> And so, we don't account steal time.

Think of the program below.

> > Skipping accounting of user/system time whenever there's any stolen
> > time detected probably breaks u/s accounting on non-cpu-hog loads.
> I am willing to test some workloads you can suggest, but right now,
> (yeah, I mostly used cpu-hogs), this scheme worked better.
>
> Linux does statistical sampling for accounting anyway, so I don't see
> it getting much worse.

A "cpu hog" that sleeps 1us every 1ms.

> > I suppose steal time should be accounted separately from u/s ticks, as
> > Xen does.
> It requires us to hook somewhere else, which I deem as overcomplicated.
> Do you have any suggestion on how to make it simple?

Unfortunately no.

> Furthermore, "doing separate", is equivalent of not skipping user/system,
> if we really prefer to.
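
To make the three cases from the patch description quoted at the top concrete, here is a minimal user-space sketch of the per-tick decision. It is not the patch's code: every name (tick_kind, cpu_times, steal_source, flush_steal, account_tick) is hypothetical, and it assumes that the whole reported out-of-cpu amount is charged as steal and that an idle tick is still counted as idle.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are hypothetical; none are taken from the patch. */
enum tick_kind { TICK_USER, TICK_SYSTEM, TICK_IDLE };

struct cpu_times {
	uint64_t user, system, idle, steal;	/* all in ticks */
};

/* Stand-in for whatever reports out-of-cpu time. */
struct steal_source {
	uint64_t pending;	/* ticks spent off the cpu since the last flush */
};

/* Return the pending out-of-cpu time and reset the source's state. */
static uint64_t flush_steal(struct steal_source *src)
{
	uint64_t stolen = src->pending;

	src->pending = 0;
	return stolen;
}

/* Account one timer tick according to the three quoted cases. */
static void account_tick(struct cpu_times *t, struct steal_source *src,
			 enum tick_kind kind)
{
	uint64_t stolen;

	assert(t && src);
	stolen = flush_steal(src);

	if (kind == TICK_IDLE) {
		/* Idle: the flush only refreshes the source; no steal is charged. */
		t->idle++;
	} else if (stolen > 0) {
		/* Out-of-cpu time registered: skip user/system, charge steal only. */
		t->steal += stolen;
	} else if (kind == TICK_USER) {
		/* Got the cpu for the whole slice: account normally. */
		t->user++;
	} else {
		t->system++;
	}
}
```

In this sketch a user tick with any pending out-of-cpu time adds nothing to user, which is exactly the behaviour being questioned for non-cpu-hog loads.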
>
> > +	if (delta > 1000UL)
> > +		touch_softlockup_watchdog();
> > +
> >
> > This will break authentic soft lockup detection whenever qemu processing
> > takes more than 1s.
>
> This should be 10s. 1000UL is a typo.

Comment is still valid.
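
For reference, the kind of workload meant by a "cpu hog" that sleeps 1us every 1ms could look roughly like the sketch below. It is only an illustration under assumptions: hog_cycle and run_hog are made-up names, the busy phase is a plain spin on CLOCK_MONOTONIC, and nanosleep() will in practice sleep longer than the requested 1us.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() and nanosleep() */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * One cycle of the hypothetical workload: spin for at least busy_ns,
 * then ask to sleep for sleep_ns. Returns the time spent spinning.
 */
static uint64_t hog_cycle(uint64_t busy_ns, uint64_t sleep_ns)
{
	uint64_t start = now_ns();
	uint64_t spun;
	struct timespec req;

	assert(sleep_ns < 1000000000ull);	/* tv_nsec must stay below 1s */

	do {
		spun = now_ns() - start;
	} while (spun < busy_ns);

	req.tv_sec = 0;
	req.tv_nsec = (long)sleep_ns;
	nanosleep(&req, NULL);	/* the task goes idle here, by its own choice */

	return spun;
}

/* Run the hog for a number of cycles: ~1ms busy, 1us sleep each. */
static uint64_t run_hog(unsigned int cycles)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < cycles; i++)
		total += hog_cycle(1000000ull, 1000ull);
	return total;
}
```

Calling run_hog() in a loop from main() inside a guest gives a task that is almost always runnable but still schedules out voluntarily about once per millisecond.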