When we run out of maximum rqnodes, the original queued spin lock slow
path falls back to a try lock. In such a case, we are again susceptible
to stalls in case the lock owner fails to make progress. We use the
timeout as a fallback to break out of this loop and return to the
caller.

This is a fallback for an extreme edge case, when on the same CPU we run
out of all 4 qnodes. When could this happen? We are in the slow path in
task context, we get interrupted by an IRQ, which while in the slow path
gets interrupted by an NMI, which in the slow path gets another nested
NMI, which enters the slow path. All of the interruptions happen after
node->count++.

We use RES_DEF_TIMEOUT as our spinning duration, but in the case of this
fallback, no fairness is guaranteed, so the duration may be too small
for contended cases, as the waiting time is not bounded. Since this is
an extreme corner case, let's just prefer timing out instead of
attempting to spin for longer.

Reviewed-by: Barret Rhoden <brho@xxxxxxxxxx>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>
---
 kernel/locking/rqspinlock.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 9ad18b3c46f7..16ec1b9eb005 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -271,8 +271,14 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 	 */
 	if (unlikely(idx >= _Q_MAX_NODES)) {
 		lockevent_inc(lock_no_node);
-		while (!queued_spin_trylock(lock))
+		RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
+		while (!queued_spin_trylock(lock)) {
+			if (RES_CHECK_TIMEOUT(ts, ret)) {
+				lockevent_inc(rqspinlock_lock_timeout);
+				break;
+			}
 			cpu_relax();
+		}
 		goto release;
 	}
 
-- 
2.43.5
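
For readers who want to experiment with the idea outside the kernel, the
bounded trylock loop described above can be sketched in userspace C. This is
only an illustration under stated assumptions: struct demo_lock, demo_now_ns(),
demo_trylock(), demo_unlock() and demo_lock_bounded() are hypothetical names
invented for this sketch, the lock is a plain C11 atomic_flag rather than an
rqspinlock_t, and the -ETIMEDOUT return value is an illustrative choice, not a
statement about what RES_CHECK_TIMEOUT stores in ret.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace stand-in for a test-and-set lock (not rqspinlock_t). */
struct demo_lock {
	atomic_flag held;
};

/* Current monotonic time in nanoseconds. */
static uint64_t demo_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int demo_trylock(struct demo_lock *lock)
{
	return !atomic_flag_test_and_set_explicit(&lock->held,
						  memory_order_acquire);
}

static void demo_unlock(struct demo_lock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/*
 * Spin on trylock until the lock is taken or timeout_ns has elapsed.
 * Returns 0 on success and -ETIMEDOUT (an assumed error code for this
 * sketch) once the deadline passes. There is no queueing, so waiters
 * get no fairness guarantee, just as in the fallback described above.
 */
static int demo_lock_bounded(struct demo_lock *lock, uint64_t timeout_ns)
{
	uint64_t deadline;

	assert(lock != NULL);
	deadline = demo_now_ns() + timeout_ns;

	while (!demo_trylock(lock)) {
		if (demo_now_ns() >= deadline)
			return -ETIMEDOUT;
		/* The kernel loop calls cpu_relax() here; omitted in this sketch. */
	}
	return 0;
}
```

A caller would initialise the lock with ATOMIC_FLAG_INIT, call
demo_lock_bounded() with a budget in nanoseconds, and treat a negative return
as "lock not held", skipping both the critical section and demo_unlock().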