On Wed, 2011-01-12 at 22:02 -0500, Rik van Riel wrote:

> Cgroups only makes the matter worse - libvirt places
> each KVM guest into its own cgroup, so a VCPU will
> generally always be alone on its own per-cgroup, per-cpu
> runqueue! That can lead to pulling a VCPU onto our local
> CPU because we think we are alone, when in reality we
> share the CPU with others...

How can that happen? If the task you're trying to accelerate isn't in
your task group, the whole attempt should be a noop.

> Removing the pulling code allows me to use all 4
> CPUs with a 4-VCPU KVM guest in an uncontended situation.
>
> > + /* Tell the scheduler that we'd really like pse to run next. */
> > + p_cfs_rq->next = pse;
>
> Using set_next_buddy propagates this up to the root,
> allowing the scheduler to actually know who we want to
> run next when cgroups is involved.
>
> > + /* We know whether we want to preempt or not, but are we allowed? */
> > + if (preempt && same_thread_group(p, task_of(p_cfs_rq->curr)))
> > + resched_task(task_of(p_cfs_rq->curr));
>
> With this in place, we can get into the situation where
> we will gladly give up CPU time, but not actually give
> any to the other VCPUs in our guest.
>
> I believe we can get rid of that test, because pick_next_entity
> already makes sure it ignores ->next if picking ->next would
> lead to unfairness.

Preempting everybody who is in your way isn't playing nice neighbor, so
I think at least the same_thread_group() test needs to stay. But that's
Peter's call. Starting a zillion threads to play wakeup preempt and
let's hog the cpu isn't nice either, but it's allowed.

> Removing this test (and simplifying yield_to_task_fair) seems
> to lead to more predictable test results.

Less is more :)

	-Mike
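
For readers following the set_next_buddy point quoted above, here is a toy model of the difference between hinting only the task's own runqueue and hinting every level up to the root. This is a sketch under stated assumptions, not kernel code: the two structs are stripped down to the fields needed for the illustration, and the function names mark_leaf_only and mark_up_to_root are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct sched_entity;

/* Toy per-group runqueue: only the "next buddy" hint is modelled. */
struct cfs_rq {
    struct sched_entity *next;
};

/* Toy entity: a task, or a group as seen from its parent's runqueue. */
struct sched_entity {
    struct cfs_rq *cfs_rq;        /* runqueue this entity is queued on */
    struct sched_entity *parent;  /* group entity one level up, NULL at the top */
};

/* Hypothetical helper: leaf-only hint, in the spirit of "p_cfs_rq->next = pse". */
void mark_leaf_only(struct sched_entity *se)
{
    assert(se != NULL && se->cfs_rq != NULL);
    se->cfs_rq->next = se;
}

/* Hypothetical helper: set the hint on every level from the task to the root. */
void mark_up_to_root(struct sched_entity *se)
{
    for (; se != NULL; se = se->parent) {
        assert(se->cfs_rq != NULL);
        se->cfs_rq->next = se;
    }
}
```

With a VCPU task queued on its guest's group runqueue, and that group queued on the root runqueue, mark_leaf_only leaves the root's hint untouched, so a pick starting at the root never learns about the preference. mark_up_to_root also marks the group entity on the root runqueue.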