On Wed, Jul 17, 2019 at 10:27:30AM -0400, Alex Kogan wrote: > > On Jul 17, 2019, at 4:39 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > > static void cna_splice_tail(struct cna_node *cn, struct cna_node *head, struct cna_node *tail) > > { > > struct cna_node *list; > > > > /* remove [head,tail] */ > > WRITE_ONCE(cn->mcs.next, tail->mcs.next); > > tail->mcs.next = NULL; > > > > /* stick [head,tail] on the secondary list tail */ > > if (cn->mcs.locked <= 1) { > > /* create secondary list */ > > head->tail = tail; > > cn->mcs.locked = head->encoded_tail; > > } else { > > /* add to tail */ > > list = (struct cna_node *)decode_tail(cn->mcs.locked); > > list->tail->next = head; > > list->tail = tail; > > } > > } > > > > static struct cna_node *cna_find_next(struct mcs_spinlock *node) > > { > > struct cna_node *cni, *cn = (struct cna_node *)node; > > struct cna_node *head, *tail = NULL; > > > > /* find any next lock from 'our' node */ > > for (head = cni = (struct cna_node *)READ_ONCE(cn->mcs.next); > > cni && cni->node != cn->node; > > tail = cni, cni = (struct cna_node *)READ_ONCE(cni->mcs.next)) > > ; > > > > /* when found, splice any skipped locks onto the secondary list */ > > if (cni && tail) > > cna_splice_tail(cn, head, tail); > > > > return cni; > > } > > > > How's that? > > This is almost perfect!! :) > > The above should work, but I think we should have a specialized fast-path for > checking the immediate next node in the main queue. This would be the common > case, once we splice ‘other’ nodes onto the secondary queue. In the above we > would go through four branches before returning from cna_find_next(). In the > following we would have just one: Right, but can you measure a difference? ;-) Anyway, no real objection, just playing devils advocate here. > > static struct cna_node *cna_find_next(struct mcs_spinlock *node) > > { > > struct cna_node *cn = (struct cna_node *)node; > struct cna_node *cni = (struct cna_node *)READ_ONCE(cn->mcs.next); > > > > struct cna_node *head, *tail = NULL; > > > /* fast path */ > if (cni->node == cn->node) > return cni; > > > /* find any next lock from 'our' node */ > > for (head = cn->mcs.next; head = cni, you just did that load :-) > > cni && cni->node != cn->node; > > tail = cni, cni = (struct cna_node *)READ_ONCE(cni->mcs.next)) > > ; > > > > /* when found, splice any skipped locks onto the secondary list */ > > if (cni && tail) > > cna_splice_tail(cn, head, tail); > > > > return cni; > > } > > > Also, any reason you say ‘lock’ instead of ’node’ in the comments? > I.e., I think "when found, splice any skipped locks onto the secondary list” should be > "when found, splice any skipped nodes onto the secondary list”. Due to the confusion between lock waiter node and numa node :-)