> The concern I have is that even though we have gone through changes to
> help reduce the candidate vcpus we yield to, we still have a very poor
> idea of which vcpu really needs to run. The result is high cpu usage in
> get_pid_task and still some contention in the double runqueue lock.
> To make this scalable, we either need to significantly reduce the
> occurrence of lock-holder preemption, or do a much better job of
> knowing which vcpu needs to run (and not unnecessarily yielding to vcpus
> which do not need to run).

The patches that Raghavendra has been posting do accomplish that.

> On reducing the occurrence: The worst case for lock-holder preemption
> is having vcpus of the same VM on the same runqueue. This guarantees the
> situation of one vcpu running while another [of the same VM] is not. To
> prove the point, I ran the same test, but with vcpus restricted to a
> range of host cpus, such that any single VM's vcpus can never be on the
> same runqueue. In this case, all 10 VMs' vcpu-0's are on host cpus 0-4,
> vcpu-1's are on host cpus 5-9, and so on. Here is the result:
>
> kvm_cpu_spin, and all
> yield_to changes, plus
> restricted vcpu placement:   8823 +/- 3.20%   much, much better
>
> On picking a better vcpu to yield to: I really hesitate to rely on a
> paravirt hint [telling us which vcpu is holding a lock], but I am not
> sure how else to reduce the candidate vcpus to yield to. I suspect we
> are yielding to way more vcpus than are preempted lock-holders, and that
> IMO is just work accomplishing nothing. Trying to think of a way to
> further reduce candidate vcpus....

... the patches are posted - you could try them out?
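
For reference, the placement restriction described above does not need
anything exotic; pinning each vcpu thread with plain sched_setaffinity()
is enough. A minimal sketch (the tid argument and the 5-cpu stride are
purely illustrative -- the actual test could just as well have used
taskset or cgroup cpusets):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

/* Pin one vcpu thread (by kernel tid) to host cpus [first, last].
 * With a 5-cpu range per vcpu index, no two vcpus of the same VM
 * can ever share a runqueue. */
static int pin_vcpu_thread(pid_t tid, int first, int last)
{
	cpu_set_t mask;
	int cpu;

	CPU_ZERO(&mask);
	for (cpu = first; cpu <= last; cpu++)
		CPU_SET(cpu, &mask);

	return sched_setaffinity(tid, sizeof(mask), &mask);
}

int main(int argc, char **argv)
{
	pid_t tid;
	int vcpu;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <vcpu-tid> <vcpu-index>\n", argv[0]);
		return 1;
	}
	tid = atoi(argv[1]);
	vcpu = atoi(argv[2]);

	/* vcpu N of every VM -> host cpus 5N..5N+4 (0-4, 5-9, ...). */
	if (pin_vcpu_thread(tid, vcpu * 5, vcpu * 5 + 4)) {
		perror("sched_setaffinity");
		return 1;
	}
	return 0;
}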
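
On the paravirt hint question above, the shape being discussed is roughly
the following: the guest bumps a counter in memory shared with the host
around spinlock critical sections, and the host's directed-yield path
skips preempted vcpus that do not have it set. Every name below is made
up for illustration; none of this is an existing KVM or pvops interface:

#include <stddef.h>

struct pv_lock_hint {
	unsigned int lock_depth;	/* >0 while holding a guest spinlock */
};

/* Guest side: bracket spinlock critical sections. */
static inline void pv_hint_lock(struct pv_lock_hint *hint)
{
	hint->lock_depth++;
}

static inline void pv_hint_unlock(struct pv_lock_hint *hint)
{
	hint->lock_depth--;
}

static inline int pv_hint_is_lock_holder(const struct pv_lock_hint *hint)
{
	return hint->lock_depth > 0;
}

/* Host-side view of a vcpu, reduced to what this sketch needs. */
struct vcpu {
	int preempted;			/* set when the vcpu thread was descheduled */
	struct pv_lock_hint *hint;	/* mapped from the guest-shared page */
};

/* Host yield path (shape only): instead of trying every runnable vcpu,
 * only yield to preempted vcpus that claim to hold a lock, so we stop
 * yielding to vcpus that do not need to run. */
static struct vcpu *pick_yield_target(struct vcpu *vcpus, int nr_vcpus)
{
	int i;

	for (i = 0; i < nr_vcpus; i++) {
		struct vcpu *v = &vcpus[i];

		if (!v->preempted)
			continue;
		if (!pv_hint_is_lock_holder(v->hint))
			continue;
		return v;
	}
	return NULL;	/* nothing obviously worth yielding to */
}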