On 12/01/2010 02:37 PM, Srivatsa Vaddagiri wrote:
On Wed, Nov 24, 2010 at 04:23:15PM +0200, Avi Kivity wrote:
> >>I'm more concerned about lock holder preemption, and interaction
> >>of this mechanism with any kernel solution for LHP.
> >
> >Can you suggest some scenarios and I'll create some test cases?
> >I'm trying to figure out the best way to evaluate this.
>
> Booting 64-vcpu Windows on a 64-cpu host with PLE but without
> directed yield takes longer than forever because PLE detects
> contention within the guest, which under our current PLE
> implementation (usleep(100)) converts guest contention into delays.
Is there any way of optimizing PLE at runtime in such a special case? For
example, either turn off the PLE feature or gradually increase the
(spin-)timeout after which PLE should kick in.
It's not a special case at all. Both host contention and guest
contention are perfectly normal, and can occur simultaneously.
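
The "gradually increase the (spin-)timeout" idea from the question above
could be sketched roughly as follows. This is only an illustration: the
class name, the starting window, the bounds, and the doubling/halving policy
are all assumptions made for the example, not anything KVM is stated to do
in this thread. It also does not answer the objection that host and guest
contention can occur at the same time.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSpinWindow:
    """Hypothetical per-vcpu spin window, in arbitrary 'cycle' units."""

    window: int = 4096     # assumed starting value
    minimum: int = 4096    # assumed lower bound
    maximum: int = 65536   # assumed upper bound

    def on_ple_exit(self, yield_helped: bool) -> int:
        """Adjust the window after a pause-loop exit and return the new value."""
        if yield_helped:
            # Giving up the cpu let somebody else make progress, so exiting
            # early was useful: shrink the window back toward the minimum.
            self.window = max(self.minimum, self.window // 2)
        else:
            # Nobody benefited from the exit (pure guest contention), so spin
            # longer before exiting next time, up to the cap.
            self.window = min(self.maximum, self.window * 2)
        return self.window
```

Repeated unhelpful exits grow the window up to the cap, so a guest that is
only contending with itself pays for fewer exits. A helpful exit halves it
again.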
> (a directed yield implementation would find that all vcpus are
> runnable, yielding optimal results under this test case).
I would think a plain yield() (rather than usleep/directed yield) would suffice
here (yield would realize that there is nobody else to yield to and continue
running the same vcpu thread).
Currently yield() is a no-op on Linux.
As regards any concern about leaking cpu
bandwidth because of a plain yield, I think it can be fixed by a
simpler modification to yield that allows a thread to reclaim whatever timeslice
it gave up previously [1].
If some other thread used that timeslice, don't we have an accounting
problem?
Regarding directed yield, do we have any reliable mechanism to find the target
of a directed yield in this (unmodified/non-paravirtualized guest) case? IOW,
how do we determine the vcpu thread to which cycles need to be yielded upon
contention?
My idea was to yield to a random starved vcpu of the same guest. There
are several cases to consider:
- we hit the right vcpu; lock is released, party.
- we hit some vcpu that is doing unrelated work. yielding thread
  doesn't make progress, but we're not wasting cpu time.
- we hit another waiter for the same lock. it will also PLE exit and
  trigger a directed yield. this increases the cost of directed yield by
  a factor of count_of_runnable_but_not_running_vcpus, which could be
  large, but not disastrously so (i.e. don't run a 64-vcpu guest on a
  uniprocessor host with this)
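
The selection step described above ("yield to a random starved vcpu of the
same guest") might look something like the sketch below. It is a userspace
toy, not kernel code: the Vcpu record, its fields, and pick_yield_target()
are hypothetical names, and "starved" is taken here to mean "runnable but not
currently running", which is an assumption.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Vcpu:
    """Hypothetical view of one vcpu thread of a guest."""

    vcpu_id: int
    runnable: bool  # wants cpu time
    running: bool   # currently on a physical cpu


def pick_yield_target(
    vcpus: Sequence[Vcpu], spinner: Vcpu, rng: random.Random
) -> Optional[Vcpu]:
    """Pick a random starved sibling of the spinning vcpu, or None."""
    candidates = [
        v
        for v in vcpus
        if v.vcpu_id != spinner.vcpu_id and v.runnable and not v.running
    ]
    if not candidates:
        # Every sibling is already on a cpu (or idle): nothing to yield to,
        # so the spinner should simply keep running.
        return None
    return rng.choice(candidates)
```

In the 64-vcpu-on-64-cpu boot test the candidate list is empty, so the
spinner keeps its cpu instead of sleeping. If the pick lands on another
waiter for the same lock, that vcpu would PLE exit and repeat the selection,
which is where the count_of_runnable_but_not_running_vcpus factor comes from.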
> So if you were to test something similar running with a 20% vcpu
> cap, I'm sure you'd run into similar issues. It may show with fewer
> vcpus (I've only tested 64).
>
> >Are you assuming the existence of a directed yield and the
> >specific concern is what happens when a directed yield happens
> >after a PLE and the target of the yield has been capped?
>
> Yes. My concern is that we will see the same kind of problems
> directed yield was designed to fix, but without allowing directed
> yield to fix them. Directed yield was designed to fix lock holder
> preemption under contention,
For modified guests, something like [2] seems to be the best approach to fix
the lock-holder preemption (LHP) problem, which does not require any sort of
directed yield support. Essentially, upon contention, a vcpu registers its lock
of interest and goes to sleep (via hypercall), waiting for the lock owner to
wake it up (again via another hypercall).
Right.
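
As a rough illustration of the register-and-sleep / wake-up pairing described
above, here is a toy model. It is not the interface from [2]: the class, the
hc_wait()/hc_wake() method names, and the FIFO wake order are all assumptions
made for the sketch, with hypercalls modelled as plain method calls.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable, Optional, Set


class ToyHypervisor:
    """Toy model of the two hypercalls used by a paravirtualized lock."""

    def __init__(self) -> None:
        # lock id -> ids of vcpus sleeping on that lock, oldest first
        self._waiters: Dict[Hashable, Deque[int]] = defaultdict(deque)
        self.sleeping: Set[int] = set()

    def hc_wait(self, vcpu_id: int, lock_id: Hashable) -> None:
        """Contended vcpu registers its lock of interest and goes to sleep."""
        self._waiters[lock_id].append(vcpu_id)
        self.sleeping.add(vcpu_id)

    def hc_wake(self, lock_id: Hashable) -> Optional[int]:
        """Lock owner, on release, wakes one waiter (FIFO is an assumption)."""
        queue = self._waiters.get(lock_id)
        if not queue:
            return None  # nobody was sleeping on this lock
        vcpu_id = queue.popleft()
        self.sleeping.discard(vcpu_id)
        return vcpu_id
```

The point of the model is that a waiter is off the cpu only between
hc_wait() and the matching hc_wake(), i.e. for as long as the lock is
actually held.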
For unmodified guests, IMHO a plain yield (or slightly enhanced yield [1])
should fix the LHP problem.
A plain yield (ignoring no-opiness on Linux) will penalize the running
guest wrt other guests. We need to maintain fairness.
Fyi, Xen folks also seem to be avoiding a directed yield for some of the same
reasons [3].
I think that fails for unmodified guests, where you don't know when the
lock is released and so you don't have a wake_up notification. You lost
a large timeslice and you can't gain it back, whereas with pv the wakeup
means you only lose as much time as the lock was held.
Given this line of thinking, hard-limiting guests (either in user-space or
kernel-space, latter being what I prefer) should not have adverse interactions
with LHP-related solutions.
If you hard-limit a vcpu that holds a lock, any waiting vcpus are also
halted. With directed yield you can let the lock holder make some
progress at the expense of another vcpu. A regular yield() will simply
stall the waiter.
--
error compiling committee.c: too many arguments to function