On Sat, Dec 30, 2023 at 03:49:52PM +0000, David Laight wrote:
[...]
> I don't completely understand the 'acquire'/'release' semantics (they didn't
> exist when I started doing SMP kernel code in the late 1980s).
> But it looks odd that osq_unlock()'s fast path uses _release but the very
> similar code in osq_wait_next() uses _acquire.
>

The _release in osq_unlock() is needed because an unlock must have
RELEASE semantics, so that lock+unlock forms a critical section (i.e. no
memory accesses can escape from it). When osq_wait_next() is used in the
non-unlock cases, the RELEASE is not required. As for the case where
osq_wait_next() is used in osq_unlock(), there is an xchg() preceding it,
which provides a full barrier, so things are fine.

/me wonders whether we can relax the _acquire in osq_wait_next() into
a _relaxed.

> Indeed, apart from some (assumed) optimisations, I think osq_unlock()
> could just be:
> 	next = osq_wait_next(lock, this_cpu_ptr(&osq_node), 0);
> 	if (next)
> 		next->locked = 1;
>

If so, we would need to provide some sort of RELEASE semantics for
osq_unlock() in all cases.

Regards,
Boqun

> I don't think the order of the tests for lock->tail and node->next
> matter in osq_wait_next().
> If they were swapped the 'Second most likely case' code from osq_unlock()
> could be removed.
> (The 'uncontended case' doesn't need to load the address of 'node'.)
>
> 	David
>
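
To make the RELEASE point above concrete, here is a minimal user-space
sketch. It uses C11 <stdatomic.h>, not the kernel's atomic_*() API, and it
is not the osq code: struct toy_lock, TOY_UNLOCKED, toy_trylock(),
toy_unlock_fast() and toy_selftest() are all hypothetical names made up for
illustration. It only mirrors the shape of an unlock fast path that is a
cmpxchg with RELEASE ordering, paired with an ACQUIRE on the lock side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_UNLOCKED 0	/* hypothetical "nobody queued" value */

struct toy_lock {
	atomic_int tail;	/* TOY_UNLOCKED when free, else the owner's id */
	int protected_data;	/* plain data guarded by the lock */
};

/*
 * ACQUIRE on success: accesses inside the critical section cannot be
 * reordered before the point where the lock is taken.
 */
static bool toy_trylock(struct toy_lock *l, int id)
{
	int expected = TOY_UNLOCKED;

	return atomic_compare_exchange_strong_explicit(&l->tail, &expected, id,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * RELEASE on success: accesses inside the critical section cannot be
 * reordered after the point where the lock is dropped. Fails (and changes
 * nothing) if 'id' is not the current owner.
 */
static bool toy_unlock_fast(struct toy_lock *l, int id)
{
	int expected = id;

	return atomic_compare_exchange_strong_explicit(&l->tail, &expected,
			TOY_UNLOCKED,
			memory_order_release, memory_order_relaxed);
}

/* Single-threaded sanity check of the state transitions only. */
static int toy_selftest(void)
{
	struct toy_lock l;

	atomic_init(&l.tail, TOY_UNLOCKED);
	l.protected_data = 0;

	assert(toy_trylock(&l, 1));	/* owner 1 takes the lock */
	l.protected_data = 42;		/* store inside the critical section */
	assert(!toy_trylock(&l, 2));	/* owner 2 is kept out */
	assert(!toy_unlock_fast(&l, 2));	/* a non-owner cannot unlock */
	assert(toy_unlock_fast(&l, 1));	/* RELEASE publishes the store */
	assert(toy_trylock(&l, 2));	/* ACQUIRE pairs with that RELEASE */
	assert(l.protected_data == 42);
	assert(toy_unlock_fast(&l, 2));
	return 0;
}
```

The self-test only exercises the lock-word transitions in one thread; it
cannot demonstrate reordering. The ordering argument is the pairing itself:
a later owner whose ACQUIRE cmpxchg reads the value written by the RELEASE
cmpxchg is guaranteed to see protected_data == 42. If the unlock were
_relaxed, that guarantee would be gone.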