On Mon, May 12, 2014 at 05:22:08PM +0200, Radim Krčmář wrote:
> 2014-05-07 11:01-0400, Waiman Long:
> > From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >
> > Because the qspinlock needs to touch a second cacheline; add a pending
> > bit and allow a single in-word spinner before we punt to the second
> > cacheline.
>
> I think there is an unwanted scenario on virtual machines:
> 1) VCPU sets the pending bit and starts spinning.
> 2) Pending VCPU gets descheduled.
>     - we have PLE and lock holder isn't running [1]
>     - the hypervisor randomly preempts us
> 3) Lock holder unlocks while pending VCPU is waiting in queue.
> 4) Subsequent lockers will see free lock with set pending bit and will
>    loop in trylock's 'for (;;)'
>     - the worst-case is lock starving [2]
>     - PLE can save us from wasting whole timeslice
>
> Retry threshold is the easiest solution, regardless of its ugliness [4].
>
> Another minor design flaw is that formerly first VCPU gets appended to
> the tail when it decides to queue;
> is the performance gain worth it?

This is all for real hardware; I've not yet stared at the (para)virt
crap.

My primary concern is that native hardware runs well and that the
(para)virt support does not wreck the code; so far it's failing hard on
the second.
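
For reference, the retry threshold suggested in the quoted scenario could
look roughly like the sketch below. It is a userspace illustration built on
C11 atomics, not the qspinlock code from the patch: the bit layout
(Q_LOCKED, Q_PENDING), the function name trylock_or_pending(), the result
enum and the max_retries parameter are all assumptions made up for the
example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock-word layout, for illustration only. */
#define Q_LOCKED   0x001u   /* lock is held */
#define Q_PENDING  0x100u   /* one in-word spinner is waiting */

enum try_result {
	TRY_ACQUIRED,   /* caller now owns the lock */
	TRY_PENDING,    /* caller set the pending bit and must spin in-word */
	TRY_QUEUE,      /* caller should fall back to the queue */
};

/*
 * Bounded "trylock || pending" attempt.
 *
 * A free lock with the pending bit set means the pending waiter has not
 * taken the lock yet (for example because its VCPU was preempted).
 * Instead of looping on that state forever, give up after max_retries
 * iterations and tell the caller to queue.
 */
static enum try_result trylock_or_pending(_Atomic uint32_t *lock,
					  unsigned int max_retries)
{
	uint32_t val = atomic_load(lock);

	for (unsigned int tries = 0; tries < max_retries; tries++) {
		uint32_t new;

		if (val == 0) {
			new = Q_LOCKED;                 /* free: take it */
		} else if (val == Q_LOCKED) {
			new = Q_LOCKED | Q_PENDING;     /* held: become pending */
		} else if (val == Q_PENDING) {
			/* free but pending is set: re-read and count a retry */
			val = atomic_load(lock);
			continue;
		} else {
			return TRY_QUEUE;               /* held and pending: contended */
		}

		/* On failure, val is refreshed with the current lock word. */
		if (atomic_compare_exchange_strong(lock, &val, new))
			return new == Q_LOCKED ? TRY_ACQUIRED : TRY_PENDING;
	}

	return TRY_QUEUE;   /* threshold reached */
}
```

A caller that gets TRY_QUEUE would take the queue path on the second
cacheline; one that gets TRY_PENDING would spin on the lock word until the
holder releases. Whether the extra branch and counter are acceptable on the
native fast path is exactly the open question here.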