On 1/24/20 4:27 PM, Alex Kogan wrote:
>
>> On Jan 24, 2020, at 4:12 PM, Waiman Long <longman@xxxxxxxxxx> wrote:
>>
>> On 1/24/20 3:09 PM, Alex Kogan wrote:
>>>>> We also probably do not want those "prioritized" threads to disrupt
>>>>> normal CNA operation. E.g., if the main queue looks like T1_1, P2_1,
>>>>> T1_2, …, where T1_x is a thread running on node 1 and P2_1 is a
>>>>> prioritized thread running on node 2, we want to pass the lock from
>>>>> T1_1 to P2_1 and then to T1_2 (rather than have P2_1 scan for another
>>>>> thread on node 2).
>>>>>
>>>>> There is a way to achieve that: when we pass the lock to P2_1, we can
>>>>> set its numa node field to 1. This means that we need to reset the
>>>>> numa node field in cna_init_node(), but AFAICT this is relatively
>>>>> cheap. The rest of the CNA logic should not change.
>>>> I won't recommend doing that. If the lock cacheline has been moved
>>>> from node 1 to 2, I would say it is better to stick with node 2 rather
>>>> than switch back to node 1. That will mean that the secondary queue
>>>> may contain lock waiters from the same node, but they will eventually
>>>> be flushed back to the primary queue.
>>>>
>>> That's right, assuming we do not reset the intra_node count when
>>> transferring the lock to a prioritized thread from another node.
>>> Otherwise, we may starve waiters in the secondary queue.
>>>
>>> Still, that can make the lock even less fair to non-prioritized
>>> threads. When you flush the secondary queue, the preference may remain
>>> with the same node. This will not happen in the current form of CNA, as
>>> we never get threads from the preferred node in the secondary queue.
>> That is true.
>>
>> However, it is no different from the current scheme, in which a waiter
>> from another node may have to wait for 64k other waiters to go first
>> before it has a chance to get the lock. Now that waiter can be from the
>> same node as well.
> The difference is that in the current form of CNA, the preferred node
> _will_ change after 64k lock transitions. With the change you propose,
> this is no longer the case; it may take another ~64k transitions for
> that to happen. More generally, I think this makes the analysis of the
> lock behavior more convoluted.
>
> I think we should treat those prioritized threads as "wild cards",
> passing the lock through them but keeping the preferred node intact.
> This will potentially cost one extra lock migration, but will make
> reasoning about the lock behavior easier.

It seems like you prefer mathematical purity to practical considerations.
I won't object if you insist that is the right way to go. Just be mindful
that you may need to add a preferred node value to do that. It will also
complicate the code, but it is your choice.

Cheers,
Longman
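
[Editor's note: the toy model below is not the qspinlock_cna.h code from the
patch series; every name in it is hypothetical. It only sketches the "wild
card" hand-off Alex proposes and why, as Longman points out, it needs an
explicit preferred-node value carried alongside the waiter's own numa node
field instead of overwriting that field and resetting it in cna_init_node().]

/*
 * Hypothetical sketch of the two hand-off policies discussed above.
 * A prioritized waiter gets the lock for one critical section, but the
 * preferred node used for subsequent main-queue scans is left intact.
 */
#include <stdbool.h>
#include <stdio.h>

struct cna_node_sketch {
	int	numa_node;	/* node this waiter runs on */
	int	pref_node;	/* explicit preferred-node value (the extra field) */
	bool	prioritized;	/* "wild card" waiter that jumps the queue */
};

/*
 * Hand the lock to @next; @cur_pref is the node CNA currently prefers.
 * Returns the preferred node after the hand-off.
 */
static int cna_pass_lock_sketch(struct cna_node_sketch *next, int cur_pref)
{
	if (next->prioritized) {
		/* Wild card: the lock visits next->numa_node once, but the
		 * preference stays on @cur_pref for later queue scans. */
		next->pref_node = cur_pref;
	} else {
		/* Normal CNA hand-off: preference follows the holder's node. */
		next->pref_node = next->numa_node;
	}
	return next->pref_node;
}

int main(void)
{
	/* T1_1 (node 1) holds the lock; queue is P2_1 (node 2), T1_2 (node 1). */
	struct cna_node_sketch p2_1 = { .numa_node = 2, .prioritized = true };
	struct cna_node_sketch t1_2 = { .numa_node = 1, .prioritized = false };
	int pref = 1;

	pref = cna_pass_lock_sketch(&p2_1, pref);	/* stays 1: wild card */
	pref = cna_pass_lock_sketch(&t1_2, pref);	/* stays 1: same node  */
	printf("preferred node after T1_1 -> P2_1 -> T1_2: %d\n", pref);
	return 0;
}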