On 1/24/20 4:27 PM, Alex Kogan wrote:
>
>> On Jan 24, 2020, at 4:12 PM, Waiman Long <longman@xxxxxxxxxx> wrote:
>>
>> On 1/24/20 3:09 PM, Alex Kogan wrote:
>>>>> We also probably do not want those "prioritized" threads to disrupt
>>>>> normal CNA operation. E.g., if the main queue looks like T1_1, P2_1,
>>>>> T1_2, …, where T1_x is a thread running on node 1 and P2_1 is a
>>>>> prioritized thread running on node 2, we want to pass the lock from
>>>>> T1_1 to P2_1 and then to T1_2 (rather than have P2_1 scan for another
>>>>> thread on node 2).
>>>>>
>>>>> There is a way to achieve that: when we pass the lock to P2_1, we can
>>>>> set its numa node field to 1. This means that we need to reset the
>>>>> numa node field in cna_init_node(), but AFAICT this is relatively
>>>>> cheap. The rest of the CNA logic should not change.
>>>> I won't recommend doing that. If the lock cacheline has been moved
>>>> from node 1 to 2, I would say it is better to stick with node 2 rather
>>>> than switch back to node 1. That will mean that the secondary queue
>>>> may contain lock waiters from the same node, but they will eventually
>>>> be flushed back to the primary queue.
>>>>
>>> That's right, assuming we do not reset the intra_node count when
>>> transferring the lock to a prioritized thread from another node.
>>> Otherwise, we may starve waiters in the secondary queue.
>>>
>>> Still, that can make the lock even less fair to non-prioritized
>>> threads. When you flush the secondary queue, the preference may remain
>>> with the same node. This will not happen in the current form of CNA, as
>>> we never get threads from the preferred node in the secondary queue.
>> That is true.
>>
>> However, it is no different from the current scheme, in which a waiter
>> from another node may have to wait for 64k other waiters to go first
>> before it has a chance to get the lock. Now that waiter can be from the
>> same node as well.
> The difference is that in the current form of CNA, the preferred node
> _will_ change after 64k lock transitions. With the change you propose,
> this is no longer the case; it may take another ~64k transitions for
> that to happen. More generally, I think this makes the analysis of the
> lock behavior more convoluted.
>
> I think we should treat those prioritized threads as "wild cards",
> passing the lock through them but keeping the preferred node intact.
> This will potentially cost one extra lock migration, but will make
> reasoning about the lock behavior easier.

It seems like you prefer mathematical purity to practical considerations.
I won't object if you insist that is the right way to go. Just be mindful
that you may need to add a preferred node value to do that. It will also
complicate the code, but it is your choice.

Cheers,
Longman
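
[Editor's note: the toy model below is not the qspinlock_cna.h code from the
patch series; every name in it is hypothetical. It only sketches the "wild
card" hand-off Alex proposes and why, as Longman points out, it needs an
explicit preferred-node value carried alongside the waiter's own numa node
field instead of overwriting that field and resetting it in cna_init_node().]

/*
 * Hypothetical sketch of the two hand-off policies discussed above.
 * A prioritized waiter gets the lock for one critical section, but the
 * preferred node used for subsequent main-queue scans is left intact.
 */
#include <stdbool.h>
#include <stdio.h>

struct cna_node_sketch {
	int	numa_node;	/* node this waiter runs on */
	int	pref_node;	/* explicit preferred-node value (the extra field) */
	bool	prioritized;	/* "wild card" waiter that jumps the queue */
};

/*
 * Hand the lock to @next; @cur_pref is the node CNA currently prefers.
 * Returns the preferred node after the hand-off.
 */
static int cna_pass_lock_sketch(struct cna_node_sketch *next, int cur_pref)
{
	if (next->prioritized) {
		/* Wild card: the lock visits next->numa_node once, but the
		 * preference stays on @cur_pref for later queue scans. */
		next->pref_node = cur_pref;
	} else {
		/* Normal CNA hand-off: preference follows the holder's node. */
		next->pref_node = next->numa_node;
	}
	return next->pref_node;
}

int main(void)
{
	/* T1_1 (node 1) holds the lock; queue is P2_1 (node 2), T1_2 (node 1). */
	struct cna_node_sketch p2_1 = { .numa_node = 2, .prioritized = true };
	struct cna_node_sketch t1_2 = { .numa_node = 1, .prioritized = false };
	int pref = 1;

	pref = cna_pass_lock_sketch(&p2_1, pref);	/* stays 1: wild card */
	pref = cna_pass_lock_sketch(&t1_2, pref);	/* stays 1: same node  */
	printf("preferred node after T1_1 -> P2_1 -> T1_2: %d\n", pref);
	return 0;
}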