From: Waiman Long
> Sent: 30 December 2023 03:20
>
> On 12/29/23 17:11, David Laight wrote:
> > osq_lock() starts by setting node->next to NULL and node->locked to 0.
> > Careful analysis shows that node->next is always NULL on entry.
> >
> > node->locked is set non-zero by another cpu to force a wakeup.
> > This can only happen after the 'prev->next = node' assignment,
> > so locked can be set to zero just before that (along with the assignment
> > to node->prev).
> >
> > Only initialise node->cpu once, after that use its value instead
> > of smp_processor_id() - which is probably a real function call.
> >
> > Should reduce cache-line bouncing a little.
> >
> > Signed-off-by: David Laight <david.laight@xxxxxxxxxx>
> > ---
> >
> > Re-send without the 'RE:' on the subject line.
> >  kernel/locking/osq_lock.c | 13 ++++++-------
> >  1 file changed, 6 insertions(+), 7 deletions(-)
> >
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index d414eef4bec6..55f5db896c02 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -51,7 +51,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
> >  	      struct optimistic_spin_node *prev)
> >  {
> >  	struct optimistic_spin_node *next = NULL;
> > -	int curr = encode_cpu(smp_processor_id());
> > +	int curr = node->cpu;
> >  	int old;
> >
> >  	/*
> > @@ -98,12 +98,10 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >  {
> >  	struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
> >  	struct optimistic_spin_node *prev, *next;
> > -	int curr = encode_cpu(smp_processor_id());
> >  	int old;
> >
> > -	node->locked = 0;
> > -	node->next = NULL;
>
> I have some concern about not clearing node->next at the beginning. I
> know that next is supposed to be cleared at the end. I do have some
> worry that there may exist a race condition that leaves next not
> cleared before it is used again. So you either have to prove that such
> a race does not exist, or initialize it to NULL at the beginning like
> it is today.

I've stared at the code some more :-)

There are two places where node->next is written non-NULL, both in
osq_lock().
The first is at the top of the slow path, where 'prev->next = node'
overwrites the NULL that is currently written on entry (or, with this
patch, isn't).
The second is at the bottom of osq_lock(), where 'prev->next = next'
(Step C) overwrites the NULL written in Step A with the correct 'next'
node.

After either of those the 'node' cpu must later either take the unqueue
exit from osq_lock() or call osq_unlock(). Both require it to wait for
node->next to become non-NULL and then NULL it.

If code calls osq_lock() twice all bets are off!

Even if (somehow) node->next were non-NULL on entry it will be
overwritten by an osq_lock() call from another cpu. The only problem
would be if osq_unlock() were called before that queueing cpu set
prev->next = node. That in itself is unlikely - but would be a problem
if node->next were already set.

I don't completely understand the 'acquire'/'release' semantics (they
didn't exist when I started doing SMP kernel code in the late 1980s).
But it looks odd that osq_unlock()'s fast path uses _release while the
very similar code in osq_wait_next() uses _acquire.

Indeed, apart from some (assumed) optimisations, I think osq_unlock()
could just be:

	next = osq_wait_next(lock, this_cpu_ptr(&osq_node), 0);
	if (next)
		next->locked = 1;

I don't think the order of the tests for lock->tail and node->next
matters in osq_wait_next(). If they were swapped, the 'Second most
likely case' code in osq_unlock() could be removed.
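Very roughly - untested, keeping only the existing uncontended fast
path as the optimisation, and written against the current
osq_wait_next() prototype (so passing NULL for 'prev' rather than 0) -
the whole of osq_unlock() might then look something like:

void osq_unlock(struct optimistic_spin_queue *lock)
{
	struct optimistic_spin_node *next;
	int curr = encode_cpu(smp_processor_id());

	/*
	 * Uncontended fast path: lock->tail still holds our cpu number,
	 * so nothing was ever queued behind us and there is nobody to wake.
	 */
	if (likely(atomic_cmpxchg_release(&lock->tail, curr,
					  OSQ_UNLOCKED_VAL) == curr))
		return;

	/*
	 * Someone queued behind us: wait for it to finish linking in
	 * (or to unqueue itself) and pass the lock on.
	 */
	next = osq_wait_next(lock, this_cpu_ptr(&osq_node), NULL);
	if (next)
		next->locked = 1;
}

Whether that stays as quick as today's 'Second most likely case'
depends on osq_wait_next() doing the node->next test first, as above.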
(The 'uncontended case' doesn't need to load the address of 'node'.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)