On Wed, 2011-01-26 at 10:57 +0100, Peter Zijlstra wrote:
> On Tue, 2011-01-25 at 19:27 -0200, Glauber Costa wrote:
> > On Tue, 2011-01-25 at 22:07 +0100, Peter Zijlstra wrote:
> > > On Tue, 2011-01-25 at 18:47 -0200, Glauber Costa wrote:
> > > > On Tue, 2011-01-25 at 21:13 +0100, Peter Zijlstra wrote:
> > > > > On Tue, 2011-01-25 at 18:02 -0200, Glauber Costa wrote:
> > > > >
> > > > > > I fail to see how does clock_task influence cpu power.
> > > > > > If we also have to touch clock_task for better accounting of other
> > > > > > stuff, it is a separate story.
> > > > > > But for cpu_power, I really fail. Please enlighten me.
> > > > >
> > > > > static void update_rq_clock_task(struct rq *rq, s64 delta)
> > > > > {
> > > > > 	s64 irq_delta;
> > > > >
> > > > > 	irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
> > > > >
> > > > > 	if (irq_delta > delta)
> > > > > 		irq_delta = delta;
> > > > >
> > > > > 	rq->prev_irq_time += irq_delta;
> > > > > 	delta -= irq_delta;
> > > > > 	rq->clock_task += delta;
> > > > >
> > > > > 	if (irq_delta && sched_feat(NONIRQ_POWER))
> > > > > 		sched_rt_avg_update(rq, irq_delta);
> > > > > }
> > > > >
> > > > > its done through that sched_rt_avg_update() (should probably rename
> > > > > that), it computes a floating average of time not spend on fair tasks.
> > > > >
> > > > It creates a dependency on CONFIG_IRQ_TIME_ACCOUNTING, though.
> > > > This piece of code is simply compiled out if this option is disabled.
> > >
> > > We can pull this bit out and make the common bit also available for
> > > paravirt.
> >
> > scale_rt_power() seems to do the right thing, but all the path leading
> > to it seem to work on rq->clock, rather than rq->clock_task.
>
> Not quite, see how rq->clock_task is irq_delta less than the increment
> to rq->clock? You want it to be your steal-time delta less too.

Yes, but once this delta is subtracted from rq->clock_task, this value
is not used to dictate power, unless I am mistaken.
Power is adjusted according to scale_rt_power(), which computes it from
the values of rq->rt_avg, rq->age_stamp, and rq->clock. So whatever I
store into rq->clock_task, but not into rq->clock (which, correct me if
I'm wrong, is expected to be walltime), will not be used to adjust cpu
power, which is what I'm trying to achieve.

> > Although I do can experiment with that as well, could you please
> > elaborate on what are your reasons to prefer this over than variations
> > of the method I proposed?
>
> Because I want rq->clock_task to not include steal-time.

Sure, fair deal. But at this point, those demands seem orthogonal to me.
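
To make the disagreement concrete, here is a minimal user-space sketch of
the two paths discussed above. It is not the kernel's code: every model_*
name is hypothetical, the 500 ms averaging period is an assumed value, and
the decay of rt_avg, the sched_feat(NONIRQ_POWER) check, and any frequency
scaling are left out. It only assumes what the thread states, namely that
the power calculation reads rt_avg, age_stamp and clock, and that non-task
time reaches rt_avg through something like sched_rt_avg_update().

```c
#include <stdint.h>

/* Hypothetical user-space model; model_* names are not kernel symbols. */
#define MODEL_LOAD_SHIFT	10
#define MODEL_LOAD_SCALE	(1ULL << MODEL_LOAD_SHIFT)	/* full power = 1024 */
#define MODEL_AVG_PERIOD_NS	500000000ULL	/* assumed averaging period */

struct model_rq {
	uint64_t clock;		/* wall time seen by the runqueue, in ns */
	uint64_t clock_task;	/* wall time minus irq/steal time, in ns */
	uint64_t age_stamp;	/* start of the current averaging window */
	uint64_t rt_avg;	/* accumulated time not spent on fair tasks */
};

/*
 * Advance both clocks by delta ns, of which nontask_delta ns were spent
 * in irq or stolen by the hypervisor. The non-task share is clamped to
 * delta, kept out of clock_task, and accumulated into rt_avg.
 */
static void model_update_clocks(struct model_rq *rq, uint64_t delta,
				uint64_t nontask_delta)
{
	if (nontask_delta > delta)
		nontask_delta = delta;

	rq->clock += delta;
	rq->clock_task += delta - nontask_delta;
	rq->rt_avg += nontask_delta;
}

/*
 * Fraction of the window left for fair tasks, scaled to MODEL_LOAD_SCALE.
 * Note that it reads clock, age_stamp and rt_avg, never clock_task.
 */
static uint64_t model_scale_power(const struct model_rq *rq)
{
	uint64_t total = MODEL_AVG_PERIOD_NS + (rq->clock - rq->age_stamp);
	uint64_t available = (total < rq->rt_avg) ? 0 : total - rq->rt_avg;

	if (total < MODEL_LOAD_SCALE)
		total = MODEL_LOAD_SCALE;
	total >>= MODEL_LOAD_SHIFT;

	return available / total;
}
```

In this model, advancing a zeroed runqueue by 1 s with 250 ms of non-task
time yields a power of 853 out of 1024. Overwriting clock_task afterwards
does not change that result, while the amount added to rt_avg does, which
suggests both positions in the thread can hold at once: clock_task can
exclude steal time, and power can be adjusted through the rt_avg path.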