On 04/16/2012 09:36 AM, Ian Campbell wrote:
> On Mon, 2012-04-16 at 16:44 +0100, Konrad Rzeszutek Wilk wrote:
>> On Sat, Mar 31, 2012 at 09:37:45AM +0530, Srivatsa Vaddagiri wrote:
>>> * Thomas Gleixner <tglx@xxxxxxxxxxxxx> [2012-03-31 00:07:58]:
>>>
>>>> I know that Peter is going to go berserk on me, but if we are
>>>> running a paravirt guest then it's simple to provide a mechanism
>>>> which allows the host (aka hypervisor) to check that in the guest
>>>> just by looking at some global state.
>>>>
>>>> So if a guest exits due to an external event, it's easy to inspect
>>>> the state of that guest and avoid scheduling it away when it was
>>>> interrupted in a spinlock-held section. That guest/host shared
>>>> state needs to be modified to tell the guest to invoke an exit
>>>> when the last nested lock has been released.
>>>
>>> I attempted something like that a while back:
>>>
>>> http://lkml.org/lkml/2010/6/3/4
>>>
>>> The issue is with ticketlocks, though. VCPUs can go into a spin
>>> without the lock being held by anybody. Say VCPUs 1-99 try to grab
>>> a lock in that order (on a host with one cpu). VCPU1 wins (after
>>> VCPU0 releases it) and releases the lock. VCPU2 is next eligible to
>>> take the lock. If it is not scheduled early enough by the host, the
>>> remaining vcpus keep spinning (even though the lock is technically
>>> not held by anybody) without making forward progress.
>>>
>>> In that situation, what we really need is for the guest to hint to
>>> the host scheduler to schedule VCPU2 early (via yield_to or
>>> something similar).
>>>
>>> The current pv-spinlock patches, however, do not track which vcpu
>>> is spinning at the head of the ticketlock. I suppose we can
>>> consider that optimization in the future and see how much benefit
>>> it provides (over the plain yield/sleep done now).
>>
>> Right. I think Jeremy played around with this at some point?
>
> 5/11 "xen/pvticketlock: Xen implementation for PV ticket locks"
> tracks which vcpus are waiting for a lock in "cpumask_t waiting_cpus"
> and tracks which lock each is waiting for in per-cpu "lock_waiting".
> This is used in xen_unlock_kick to kick the right CPU. There's a loop
> over only the waiting cpus to figure out who to kick.

Yes, and AFAIK the KVM pv-ticketlock patches do the same thing.

If a (V)CPU is asleep, then sending it a kick is pretty much
equivalent to a yield_to (not precisely, but it should get scheduled
soon enough, and it won't be competing with a pile of VCPUs that have
no useful work to do).

    J
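
Thomas's shared-state idea can be made concrete with a small sketch
(purely illustrative; no such structure exists in the patches under
discussion, and real code would need atomic accesses and barriers on
the shared fields): the guest keeps a nesting count of held spinlocks
in memory the host can read, and the host, rather than descheduling a
lock-holder, sets a flag asking the guest to exit once the count
drops to zero.

#include <stdbool.h>

/* One instance per vcpu, in memory shared between guest and host. */
struct pv_lock_state {
	unsigned int lock_depth;	/* spinlocks currently held (nested) */
	bool exit_on_unlock;		/* host wants an exit when depth hits 0 */
};

static void hypercall_yield(void)
{
	/* stand-in for a voluntary guest exit back to the host */
}

/* Guest side, hooked into its spin_lock/spin_unlock paths. */
static void guest_lock_entered(struct pv_lock_state *s)
{
	s->lock_depth++;
}

static void guest_lock_released(struct pv_lock_state *s)
{
	if (--s->lock_depth == 0 && s->exit_on_unlock) {
		s->exit_on_unlock = false;
		hypercall_yield();	/* last nested lock dropped: exit now */
	}
}

/* Host side, consulted before preempting the vcpu. */
static bool host_may_preempt(struct pv_lock_state *s)
{
	if (s->lock_depth == 0)
		return true;
	s->exit_on_unlock = true;	/* ask for an exit at the last unlock */
	return false;
}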
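
Srivatsa's starvation scenario follows directly from the FIFO nature
of ticket locks, visible in a minimal sketch (plain C11 atomics for
illustration; the kernel's real arch_spinlock code differs): each
waiter spins until now_serving reaches its own ticket, so if the host
deschedules the vcpu holding the next ticket, every later waiter burns
its timeslice even though nobody holds the lock.

#include <stdatomic.h>

struct ticketlock {
	atomic_uint next_ticket;	/* ticket handed to the next arrival */
	atomic_uint now_serving;	/* ticket currently allowed in */
};

static void ticket_lock(struct ticketlock *lk)
{
	unsigned int me = atomic_fetch_add(&lk->next_ticket, 1);

	/*
	 * If the vcpu whose ticket equals now_serving is descheduled
	 * by the host, every vcpu stuck in this loop spins uselessly;
	 * a pv version would yield or halt here instead.
	 */
	while (atomic_load(&lk->now_serving) != me)
		;	/* spin */
}

static void ticket_unlock(struct ticketlock *lk)
{
	atomic_fetch_add(&lk->now_serving, 1);
}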
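
And the tracking Ian describes from patch 5/11 has roughly this shape,
reduced here to plain C so the logic stands alone. Only waiting_cpus,
lock_waiting, and xen_unlock_kick are named in the mail; the field
names, the plain arrays standing in for cpumask_t and per-cpu data,
and the kick_cpu stub are reconstructions, not the patch's actual
code.

#include <stdbool.h>

#define NR_CPUS 64

struct ticketlock;			/* as in the sketch above */

struct lock_waiting {
	struct ticketlock *lock;	/* lock this cpu is spinning on */
	unsigned int want;		/* ticket it is waiting for */
};

static bool waiting_cpus[NR_CPUS];			/* stands in for cpumask_t */
static struct lock_waiting lock_waiting[NR_CPUS];	/* stands in for per-cpu data */

static void kick_cpu(int cpu)
{
	/* stand-in for the hypervisor IPI/event-channel kick */
	(void)cpu;
}

/*
 * Called by the unlocker with the ticket that is now being served.
 * Only cpus recorded as waiting are scanned, and only the one whose
 * ticket matches is woken -- the "kick the right CPU" loop Ian
 * mentions.
 */
static void unlock_kick(struct ticketlock *lock, unsigned int next)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!waiting_cpus[cpu])
			continue;
		if (lock_waiting[cpu].lock == lock &&
		    lock_waiting[cpu].want == next) {
			kick_cpu(cpu);
			break;
		}
	}
}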