On 09/25/2012 10:09 AM, Raghavendra K T wrote:
> On 09/24/2012 09:36 PM, Avi Kivity wrote:
>> On 09/24/2012 05:41 PM, Avi Kivity wrote:
>>>
>>>>
>>>> case 2)
>>>> rq1 : vcpu1->wait(lockA) (spinning)
>>>> rq2 : vcpu3 (running) , vcpu2->holding(lockA) [scheduled out]
>>>>
>>>> I agree that checking rq1 length is not proper in this case, and as
>>>> you rightly pointed out, we are in trouble here.
>>>> nr_running()/num_online_cpus() would give a more accurate picture
>>>> here, but it seemed costly. Maybe the load balancer saves us a bit
>>>> here by not running into such cases. (I agree the load balancer is
>>>> far too complex.)
>>>
>>> In theory preempt notifier can tell us whether a vcpu is preempted or
>>> not (except for exits to userspace), so we can keep track of whether
>>> we're overcommitted in kvm itself. It also avoids false positives
>>> from other guests and/or processes being overcommitted while our vm
>>> is fine.
>>
>> It also allows us to cheaply skip running vcpus.
>
> Hi Avi,
>
> Could you please elaborate on how preempt notifiers can be used
> here to keep track of overcommit or skip running vcpus?
>
> Are we planning to set some flag in the sched_out() handler etc?
>

Keep a bitmap kvm->preempted_vcpus.

In sched_out, test whether we're TASK_RUNNING, and if so, set a vcpu
flag and our bit in kvm->preempted_vcpus. On sched_in, if the flag is
set, clear the flag and our bit in kvm->preempted_vcpus. We can also
keep a counter of preempted vcpus.

We can use the bitmap and the counter to quickly see if spinning is
worthwhile (if the counter is zero, better to spin). If not, we can use
the bitmap to select target vcpus quickly.

The only problem is that in order to keep this accurate we need to keep
the preempt notifiers active during exits to userspace. But we can
prototype this without this change, and add it later if it works.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
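
The bitmap-and-counter bookkeeping described in the reply can be sketched
in plain user-space C, roughly as below. This is only an illustration
under stated assumptions, not a kernel patch: every name here
(vm_sketch, vcpu_sketch, sketch_sched_out and so on) is hypothetical,
the bitmap is a single 64-bit word, and the code is single-threaded. A
real implementation would presumably need atomic bit operations and
would hang these hooks off the preempt notifier callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VCPUS 64              /* assumption: one 64-bit word holds the bitmap */

struct vm_sketch {                /* hypothetical stand-in for struct kvm */
    uint64_t preempted_vcpus;     /* bit i set: vcpu i was preempted while runnable */
    int nr_preempted;             /* number of bits currently set */
};

struct vcpu_sketch {              /* hypothetical stand-in for struct kvm_vcpu */
    struct vm_sketch *vm;
    unsigned id;
    bool preempted;               /* the per-vcpu flag from the description */
};

/* sched_out: only a vcpu that was still runnable counts as preempted. */
static void sketch_sched_out(struct vcpu_sketch *v, bool task_running)
{
    assert(v->id < MAX_VCPUS);
    if (!task_running || v->preempted)
        return;                   /* blocked voluntarily, or already recorded */
    v->preempted = true;
    v->vm->preempted_vcpus |= UINT64_C(1) << v->id;
    v->vm->nr_preempted++;
}

/* sched_in: undo the bookkeeping only if sched_out recorded a preemption. */
static void sketch_sched_in(struct vcpu_sketch *v)
{
    if (!v->preempted)
        return;
    v->preempted = false;
    v->vm->preempted_vcpus &= ~(UINT64_C(1) << v->id);
    v->vm->nr_preempted--;
}

/* Counter is zero: nobody is preempted, so spinning is the better choice. */
static bool sketch_should_spin(const struct vm_sketch *vm)
{
    return vm->nr_preempted == 0;
}

/* Return the lowest-numbered preempted vcpu other than self, or -1 if none. */
static int sketch_pick_target(const struct vm_sketch *vm, unsigned self)
{
    uint64_t mask = vm->preempted_vcpus & ~(UINT64_C(1) << self);

    for (unsigned i = 0; i < MAX_VCPUS; i++)
        if (mask & (UINT64_C(1) << i))
            return (int)i;
    return -1;
}
```

In this sketch a spinning vcpu would first call sketch_should_spin() and
only fall back to sketch_pick_target() when the counter is non-zero.
Picking the lowest set bit is an arbitrary choice made for brevity; the
reply does not say how a target should be chosen among preempted vcpus.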