On Sun, Sep 27, 2009 at 04:18:00PM +0200, Avi Kivity wrote:
> On 09/27/2009 04:07 PM, Joerg Roedel wrote:
>> On Sun, Sep 27, 2009 at 03:47:55PM +0200, Avi Kivity wrote:
>>> On 09/27/2009 03:46 PM, Joerg Roedel wrote:
>>>>> We can't find exactly which vcpu, but we can:
>>>>>
>>>>> - rule out threads that are not vcpus for this guest
>>>>> - rule out threads that are already running
>>>>>
>>>>> A major problem with sleep() is that it effectively reduces the vm
>>>>> priority relative to guests that don't have spinlock contention. By
>>>>> selecting a random nonrunnable vcpu belonging to this guest, we at least
>>>>> preserve the guest's timeslice.
>>>>
>>>> Ok, that makes sense. But before trying that we should probably try to
>>>> call just yield() instead of schedule()? I remember someone from our
>>>> team here at AMD did this for Xen a while ago and already had pretty
>>>> good results with that. Xen has a completely different scheduler but
>>>> maybe it's worth trying?
>>>
>>> yield() is a no-op in CFS.
>>
>> Hmm, true. At least when kernel.sched_compat_yield == 0, which it is on my
>> distro.
>> If the scheduler gave us something like a real_yield() function
>> which assumes kernel.sched_compat_yield = 1, that might help. At least
>> it's better than sleeping for some random amount of time.
>
> Depends. If it's a global yield(), yes. If it's a local yield() that
> doesn't rebalance the runqueues we might be left with the spinning task
> re-running.

Only one runnable task on each cpu is unlikely in a situation of high vcpu
overcommit (where pause filtering matters).

> Also, if yield means "give up the remainder of our timeslice", then we
> potentially end up sleeping a much longer random amount of time. If we
> yield to another vcpu in the same guest we might not care, but if we
> yield to some other guest we're seriously penalizing ourselves.

I agree that a directed yield with possible rebalance would be good to
have, but this is very intrusive to the scheduler code and I think we
should at least try whether this simpler approach already gives us good
results.

	Joerg
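
For illustration only, the candidate selection Avi describes in the quoted text (rule out threads that are not vcpus of this guest, rule out threads that are already running, then pick one of the rest at random) could be sketched in user space roughly like this. This is not KVM or scheduler code: struct sim_thread, its fields and pick_yield_target() are hypothetical names made up for the sketch, and "not running" is assumed to be something the caller can already observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified view of a thread; not a kernel structure. */
struct sim_thread {
	int  guest_id;	/* guest this thread belongs to (assumed field) */
	bool is_vcpu;	/* false for I/O threads and other helpers */
	bool running;	/* currently executing on a physical cpu */
};

/*
 * Pick a yield target for the spinning vcpu at index 'self'.
 * Returns the index of a randomly chosen candidate, or -1 if there is none.
 */
int pick_yield_target(const struct sim_thread *t, size_t n, size_t self)
{
	int chosen = -1;
	size_t seen = 0;

	assert(self < n);

	for (size_t i = 0; i < n; i++) {
		if (i == self)
			continue;
		/* rule out threads that are not vcpus for this guest */
		if (!t[i].is_vcpu || t[i].guest_id != t[self].guest_id)
			continue;
		/* rule out threads that are already running */
		if (t[i].running)
			continue;
		/*
		 * Reservoir sampling: the k-th candidate replaces the
		 * current choice with probability about 1/k (rand() %
		 * has a small modulo bias, ignored in this sketch).
		 */
		seen++;
		if ((size_t)rand() % seen == 0)
			chosen = (int)i;
	}
	return chosen;
}
```

A return value of -1 would be the case where there is nothing sensible to yield to, so the caller would have to fall back to whatever the simpler approach does (plain yield or a short sleep).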