On 1/24/20 1:40 PM, Waiman Long wrote:
> On 1/24/20 1:19 PM, Alex Kogan wrote:
>>
>>> On Jan 24, 2020, at 11:46 AM, Waiman Long <longman@xxxxxxxxxx> wrote:
>>>
>>> On 1/24/20 11:29 AM, Alex Kogan wrote:
>>>>
>>>>> On Jan 24, 2020, at 10:19 AM, Waiman Long <longman@xxxxxxxxxx> wrote:
>>>>>
>>>>> On 1/24/20 9:42 AM, Waiman Long wrote:
>>>>>> On 1/24/20 2:52 AM, Peter Zijlstra wrote:
>>>>>>> On Thu, Jan 23, 2020 at 04:33:54PM -0500, Alex Kogan wrote:
>>>>>>>> Let me put this question to you. What do you think the number
>>>>>>>> should be?
>>>>>>> I think it would be very good to keep the inter-node latency
>>>>>>> below 1ms.
>>>>>> It is hard to guarantee that, given that lock hold times can vary
>>>>>> quite a lot depending on the workload. What we can control is just
>>>>>> how many later lock waiters can jump ahead of a given waiter.
>>>> I totally agree. I do not think you can guarantee that latency even
>>>> today. With the existing spinlock, you join the queue and wait for as
>>>> long as it takes for each and every thread in front of you to execute
>>>> its critical section.
>>>>
>>>>>>> But to realize that we need data on the lock hold times,
>>>>>>> specifically for the heavily contended locks that make CNA worth
>>>>>>> it in the first place.
>>>>>>>
>>>>>>> I don't see that data, so I don't see how we can argue about this,
>>>>>>> let alone call something reasonable.
>>>>>>>
>>>>>> In essence, the CNA lock improves throughput on NUMA machines at
>>>>>> the expense of increased worst-case latency. If low latency is
>>>>>> important, it should be disabled. If CONFIG_PREEMPT_RT is on,
>>>>>> CONFIG_NUMA_AWARE_SPINLOCKS should be off.
>>>>>
>>>>> Actually, what we are worrying about is the additional latency that
>>>>> can be added to important tasks or execution contexts that are
>>>>> waiting for a lock. Maybe we can make the CNA lock behave somewhat
>>>>> like qrwlock in that requests from interrupt context are given
>>>>> priority. We could add a priority flag in the CNA node. If the flag
>>>>> is set, we will never put it into the secondary queue. In fact, we
>>>>> can transfer control next to it even if it is not on the same node.
>>>>> We may also set the priority flag if it is an RT task that is trying
>>>>> to acquire the lock.
>>>> I think this is possible, and in fact, we have been thinking along
>>>> those lines about ways to better support RT tasks with CNA. However,
>>>> this will _probably_ require changes to the API and will _certainly_
>>>> complicate the code quite a bit.
>>>
>>> What you need to do is to modify cna_init_node() to check the current
>>> locking context and set the priority flag accordingly.
>>>
>> Is there a lightweight way to identify such a "prioritized" thread?
>
> You can use the in_task() macro in include/linux/preempt.h. This is
> just a percpu preempt_count read and test. If in_task() is false, we
> are in {soft|hard}irq or NMI context. If it is true, you can check the
> rt_task() macro to see if it is an RT task. That requires access to
> the current task structure, so it may cost a little bit more if you
> want to handle RT tasks the same way.

We may not need to do that for softirq context. If that is the case, you
can use in_irq(), which checks for hardirq and NMI only.

Peter, what is your thought on that?

Cheers,
Longman
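[Editor's note: for concreteness, a rough sketch of the flag-setting
Longman describes, using the context-test macros named in the thread.
The struct cna_node layout shown here and its "priority" field are
hypothetical illustrations, not code from the posted patch series.]

    #include <linux/preempt.h>
    #include <linux/sched.h>
    #include <linux/sched/rt.h>
    #include <linux/topology.h>

    /*
     * Hypothetical CNA queue node. Only numa_node is implied by the
     * discussion; the priority flag is the proposed addition.
     */
    struct cna_node {
    	struct mcs_spinlock	mcs;		/* embedded MCS node, must be first */
    	int			numa_node;	/* NUMA node of this waiter */
    	bool			priority;	/* if set, never move to the secondary queue */
    };

    static inline void cna_init_node(struct mcs_spinlock *node)
    {
    	struct cna_node *cn = (struct cna_node *)node;

    	cn->numa_node = numa_node_id();

    	/*
    	 * in_task() is a cheap percpu preempt_count test: false means
    	 * we are in {soft|hard}irq or NMI context.  rt_task() has to
    	 * dereference the current task structure, so it costs slightly
    	 * more.  If softirqs need no boost, !in_task() could be
    	 * replaced with in_irq(), per Longman's follow-up.
    	 */
    	cn->priority = !in_task() || rt_task(current);
    }

The CNA slowpath would then skip nodes with ->priority set when moving
remote waiters to the secondary queue, and could hand the lock to such a
node even across NUMA-node boundaries, as suggested above.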