Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 18, 2014 at 11:11:43PM -0400, Waiman Long wrote:
> On 03/17/2014 03:10 PM, Konrad Rzeszutek Wilk wrote:
> >On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote:
> >>On 03/14/2014 04:30 AM, Peter Zijlstra wrote:
> >>>On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote:
> >>>>On 03/13/2014 11:15 AM, Peter Zijlstra wrote:
> >>>>>On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote:
> >>>>>>+static inline void arch_spin_lock(struct qspinlock *lock)
> >>>>>>+{
> >>>>>>+	if (static_key_false(&paravirt_unfairlocks_enabled))
> >>>>>>+		queue_spin_lock_unfair(lock);
> >>>>>>+	else
> >>>>>>+		queue_spin_lock(lock);
> >>>>>>+}
> >>>>>So I would have expected something like:
> >>>>>
> >>>>>	if (static_key_false(&paravirt_spinlock)) {
> >>>>>		while (!queue_spin_trylock(lock))
> >>>>>			cpu_relax();
> >>>>>		return;
> >>>>>	}
> >>>>>
> >>>>>At the top of queue_spin_lock_slowpath().
> >>>>I don't like the idea of constantly spinning on the lock. That can cause all
> >>>>sort of performance issues.
> >>>Its bloody virt; _that_ is a performance issue to begin with.
> >>>
> >>>Anybody half sane stops using virt (esp. if they care about
> >>>performance).
> >>>
> >>>>My version of the unfair lock tries to grab the
> >>>>lock ignoring if there are others waiting in the queue or not. So instead of
> >>>>the doing a cmpxchg of the whole 32-bit word, I just do a cmpxchg of the
> >>>>lock byte in the unfair version. A CPU has only one chance to steal the
> >>>>lock. If it can't, it will be lined up in the queue just like the fair
> >>>>version. It is not as unfair as the other unfair locking schemes that spins
> >>>>on the lock repetitively. So lock starvation should be less a problem.
> >>>>
> >>>>On the other hand, it may not perform as well as the other unfair locking
> >>>>schemes. It is a compromise to provide some lock unfairness without
> >>>>sacrificing the good cacheline behavior of the queue spinlock.
> >>>But but but,.. any kind of queueing gets you into a world of hurt with
> >>>virt.
> >>>
> >>>The simple test-and-set lock (as per the above) still sucks due to lock
> >>>holder preemption, but at least the suckage doesn't queue. Because with
> >>>queueing you not only have to worry about the lock holder getting
> >>>preemption, but also the waiter(s).
> >>>
> >>>Take the situation of 3 (v)CPUs where cpu0 holds the lock but is
> >>>preempted. cpu1 queues, cpu2 queues. Then cpu1 gets preempted, after
> >>>which cpu0 gets back online.
> >>>
> >>>The simple test-and-set lock will now let cpu2 acquire. Your queue
> >>>however will just sit there spinning, waiting for cpu1 to come back from
> >>>holiday.
> >>>
> >>>I think you're way over engineering this. Just do the simple
> >>>test-and-set lock for virt&&   !paravirt (as I think Paolo Bonzini
> >>>suggested RHEL6 already does).
> >>The PV ticketlock code was designed to handle lock holder preemption
> >>by redirecting CPU resources in a preempted guest to another guest
> >>that can better use it and then return the preempted CPU back
> >>sooner.
> >>
> >>Using a simple test-and-set lock will not allow us to enable this PV
> >>spinlock functionality as there is no structure to decide who does
> >>what. I can extend the current unfair lock code to allow those
> >And what would be needed to do 'decide who does what'?
> >
> >
> 
> Sorry for not very clear in my previous mail. The current PV code
> will halt the CPUs if the lock isn't acquired in a certain time.
> When the locker holder come back, it will wake up the next CPU in
> the ticket lock queue. With a simple set and test lock, there is no
> structure to decide which one to wake up. So you still need to have
> some sort of queue to do that, you just can't wake up all of them
> and let them fight for the lock.

Why not? That is what bytelocks did and it worked OK in the
virtualization environment. The reason it was OK was because the
hypervisor decided which VCPU to schedule and that one would get the
lock. Granted it did have a per-cpu-what-lock-am-I-waiting-for
structure to aid in this.

In some form, the hypervisor serializes who is going to get
the lock as it ultimately decides which of the VCPUs that are
kicked to wake up - and if the lock is not FIFO - it doesn't
matter which ones gets it.

With the 'per-cpu-what-lock-I-am-waiting-for' you can also
have a round-robin of which vCPU you had kick to keep
it fair.

You might want to take a look at the PV bytelock that existed
in the Xen code prior to PV ticketlock (so 3.10) to see how
it did that.

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux