On 06/02/2010 11:50 AM, Andi Kleen wrote:
On Wed, Jun 02, 2010 at 05:51:14AM +0300, Avi Kivity wrote:
On 06/01/2010 08:27 PM, Andi Kleen wrote:
On Tue, Jun 01, 2010 at 07:52:28PM +0300, Avi Kivity wrote:
We are running everything on NUMA (since all modern machines are now NUMA).
At what scale do the issues become observable?
On Intel platforms it's visible starting with 4 sockets.
Can you recommend a benchmark that shows bad behaviour? I'll run it with
Pretty much anything with high lock contention.
Okay, we'll try to measure it here as soon as we can switch it into NUMA
mode.
Do you have any idea how we can tackle both problems?
Apparently Xen has something; perhaps that can be leveraged
(but I haven't looked at their solution in detail).
Otherwise I would probably try to start with an adaptive
spinlock that at some point calls into the hypervisor (or updates
shared memory?), like John Cooper suggested. The tricky part here would
be to find the thresholds and fit that state into
paravirt ops and the standard spinlock_t.
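
A rough user-space sketch of the adaptive idea, to make the shape of it concrete. It is not kernel code and does not touch paravirt ops or spinlock_t: the names adaptive_lock, hv_yield() and SPIN_THRESHOLD are made up for illustration, the threshold value is arbitrary, and sched_yield() merely stands in for whatever hypercall or shared-memory update a real implementation would use.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning knob: failed attempts before asking for help. */
#define SPIN_THRESHOLD 1024u

typedef struct {
    atomic_bool held;        /* true while some thread owns the lock */
    atomic_uint slow_paths;  /* how many times the threshold was hit */
} adaptive_lock;

static void adaptive_lock_init(adaptive_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->held, false);
    atomic_init(&l->slow_paths, 0);
}

/* Assumption: stand-in for a hypercall that gives up the (v)cpu. */
static void hv_yield(void)
{
    sched_yield();
}

/* One acquisition attempt; returns true if the lock was taken. */
static bool adaptive_trylock(adaptive_lock *l)
{
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

/* Spin for a bounded number of attempts, then yield and start over. */
static void adaptive_lock_acquire(adaptive_lock *l)
{
    unsigned spins = 0;

    while (!adaptive_trylock(l)) {
        if (++spins >= SPIN_THRESHOLD) {
            atomic_fetch_add_explicit(&l->slow_paths, 1,
                                      memory_order_relaxed);
            hv_yield();
            spins = 0;
        }
    }
}

static void adaptive_lock_release(adaptive_lock *l)
{
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

The slow_paths counter is only there so that one could observe how often a given threshold is exceeded, which is the part that needs tuning.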
There are two separate problems: the more general problem is that the
hypervisor can put a vcpu to sleep while holding a lock, causing other
vcpus to spin until the end of their time slice. This can only be
addressed with hypervisor help. The second problem is that the extreme
fairness of ticket locks causes lots of context switches if the
hypervisor helps, and aggravates the first problem horribly if it
doesn't (since now a vcpu will spin waiting for its ticket even if the
lock is unlocked).
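
To illustrate the second problem, here is a minimal ticket lock sketch in user-space C11. It is not the kernel's implementation; ticket_lock, ticket_take() and ticket_my_turn() are names invented for this example. Acquisition is split into "take a ticket" and "check my turn" so the ordering constraint is visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_uint next;     /* next ticket to hand out */
    atomic_uint serving;  /* ticket currently allowed to own the lock */
} ticket_lock;

static void ticket_init(ticket_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->next, 0);
    atomic_init(&l->serving, 0);
}

/* Join the queue; the returned ticket fixes this waiter's position. */
static unsigned ticket_take(ticket_lock *l)
{
    return atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
}

/* A waiter may enter only when its own ticket is being served. */
static bool ticket_my_turn(ticket_lock *l, unsigned ticket)
{
    return atomic_load_explicit(&l->serving, memory_order_acquire) == ticket;
}

/* Ordinary blocking acquire: take a ticket, spin until it is served. */
static void ticket_lock_acquire(ticket_lock *l)
{
    unsigned ticket = ticket_take(l);

    while (!ticket_my_turn(l, ticket))
        ;  /* spin */
}

/* Hand the lock to the next ticket in line, whoever holds it. */
static void ticket_lock_release(ticket_lock *l)
{
    atomic_fetch_add_explicit(&l->serving, 1, memory_order_release);
}
```

Suppose vcpus A, B and C take tickets 0, 1 and 2, and B is descheduled while waiting. When A releases, serving becomes 1: nobody holds the lock, yet C's ticket_my_turn() stays false until B runs again, takes the lock and releases it. That is the "spin waiting for its ticket even if the lock is unlocked" case.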
So yes, we'll need hypervisor assistance, but even with that we'll need
to reduce ticket lock fairness (retaining global fairness but
sacrificing some local fairness). I imagine that will be helpful for
non-virt as well, since local unfairness reduces bounciness.
--
error compiling committee.c: too many arguments to function