On 02/05/2019 04:22 AM, Peter Zijlstra wrote:
> On Mon, Feb 04, 2019 at 10:35:09PM -0500, Alex Kogan wrote:
>>> On Jan 31, 2019, at 5:00 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>
>>> On Wed, Jan 30, 2019 at 10:01:35PM -0500, Alex Kogan wrote:
>>>> Choose the next lock holder among spinning threads running on the same
>>>> socket with high probability rather than always. With small probability,
>>>> hand the lock to the first thread in the secondary queue or, if that
>>>> queue is empty, to the immediate successor of the current lock holder
>>>> in the main queue. Thus, assuming no failures while threads hold the
>>>> lock, every thread would be able to acquire the lock after a bounded
>>>> number of lock transitions, with high probability.
>>>>
>>>> Note that we could make the inter-socket transition deterministic,
>>>> by sticking a counter of intra-socket transitions in the head node
>>>> of the secondary queue. At the handoff time, we could increment
>>>> the counter and check if it is below a threshold. This adds another
>>>> field to queue nodes and nearly-certain local cache miss to read and
>>>> update this counter during the handoff. While still beating stock,
>>>> this variant adds certain overhead over the probabilistic variant.
>>> (also heavily suffers from the socket == node confusion)
>>>
>>> How would you suggest RT 'tunes' this?
>>>
>>> RT relies on FIFO fairness of the basic spinlock primitives; you just
>>> completely wrecked that.
>> This is true that CNA trades some fairness for shorter lock handover
>> latency, much like any other NUMA-aware lock.
>>
>> Can you explain, however, what exactly breaks here?
> Timeliness guarantees. FIFO-fair has well defined time behaviour; you
> know exactly how long you get to wait before you acquire the lock,
> namely however many waiters are in front of you multiplied by the worst
> case wait time.
>
> Doing time analysis on a randomized algorithm isn't my idea of fun.
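For illustration only, here is a minimal sketch of the two ideas quoted above: the FIFO wait bound (waiters in front multiplied by a worst-case per-waiter time) and the deterministic counter-and-threshold check described in the patch text. None of this is taken from the CNA patch or from qspinlock; the names fifo_wait_bound_ns, struct handoff_state and may_hand_off_locally are hypothetical, and the per-waiter worst case is simply assumed to be known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: under strict FIFO handoff, a waiter's wait is
 * bounded by the number of waiters ahead of it times the worst-case
 * time each of them takes (assumed to be known by the caller).
 */
static uint64_t fifo_wait_bound_ns(uint32_t waiters_ahead,
                                   uint64_t worst_case_ns)
{
    return (uint64_t)waiters_ahead * worst_case_ns;
}

/*
 * Hypothetical state for the deterministic variant: a counter of
 * consecutive intra-socket handoffs plus a fixed threshold.
 */
struct handoff_state {
    uint32_t local_handoffs; /* intra-socket handoffs since last reset */
    uint32_t threshold;      /* assumed tunable, must be non-zero */
};

/*
 * At handoff time, increment the counter and check whether it is still
 * below the threshold. Returns true if the lock may stay on the same
 * socket; returns false (and resets the counter) when the lock should
 * instead go to the secondary queue or the immediate successor.
 */
static bool may_hand_off_locally(struct handoff_state *s)
{
    assert(s->threshold > 0);

    s->local_handoffs++;
    if (s->local_handoffs < s->threshold)
        return true;

    s->local_handoffs = 0;
    return false;
}
```

With a threshold of 3, the check above allows two local handoffs and forces the third one away from the socket. The sketch ignores the cache-miss cost of touching the counter, which is the overhead the patch description mentions.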

The fact that RT doesn't work well with the NUMA qspinlock is another reason why I want it to be a separate slow path. We will disable it on an RT kernel, where guaranteed low latency is a must and throughput isn't as important.

Cheers,
Longman