On 09/24/2012 05:52 PM, Peter Zijlstra wrote:
> On Mon, 2012-09-24 at 17:43 +0200, Avi Kivity wrote:
>> Wouldn't this correspond to the scheduler interrupt firing and causing a
>> reschedule?  I thought the timer was programmed for exactly the point in
>> time that CFS considers the right time for a switch.  But I'm basing
>> this on my mental model of CFS, not CFS itself.
>
> No, we tried this for hrtimer kernels for a while, but programming
> hrtimers the whole time (every actual task-switch) turns out to be far
> too expensive. So we're back to HZ ticks and 'polling' the preemption
> state.

Ok, so I wasn't completely off base.

With HZ=1000, we can be at most a millisecond faster than the
interrupt-driven schedule(), and we need to be a lot faster.

> Even if we remove all the hrtimer infrastructure overhead (can do with a
> few hacks) setting the hardware requires going out to the LAPIC, which
> is stupid slow.
>
> Some hardware actually has fast/reliable/usable timers, sadly none of it
> is popular.

There is the TSC deadline timer mode of newer Intels.  Programming the
timer is a simple wrmsr, and it will fire immediately if it has already
expired.  Unfortunately it is not available on AMDs, and on virtual
hardware it will be slow (~1-2 usec).
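To put rough numbers on the HZ=1000 point above, here is a small sketch of
the arithmetic only. It is not scheduler code, and the helper names
tick_period_ns() and poll_delay_ns() are made up for this illustration. It
assumes ticks land on exact multiples of the tick period and computes how
long a preemption that has become due waits before the next tick notices it.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Length of one scheduler tick in nanoseconds for a given HZ (hz != 0). */
static inline uint64_t tick_period_ns(unsigned int hz)
{
	return NSEC_PER_SEC / hz;
}

/*
 * Hypothetical helper: delay between the moment a preemption becomes due
 * (due_ns, measured from some tick boundary) and the next tick that polls
 * the preemption state.  Returns 0 if the due time falls exactly on a tick.
 */
static inline uint64_t poll_delay_ns(uint64_t due_ns, unsigned int hz)
{
	uint64_t period = tick_period_ns(hz);
	uint64_t offset = due_ns % period;

	return offset ? period - offset : 0;
}
```

With HZ=1000 the period is 1,000,000 ns, so a preemption that becomes due
1 ns after a tick is noticed 999,999 ns late; that is the "at most a
millisecond" bound.  A timer programmed for the exact switch point would
avoid that delay, but the programming cost would be paid on every task
switch.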