Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> · Fri, 6 Sep 2019 07:14:15 +0200

Hi,

On 06/09/2019 03:48, Ming Lei wrote:

[ ... ]

>> You have not yet shared the analysis of the problem (the kernel warnings
>> only give the symptoms) nor the reasoning for the solution. It is hard
>> to understand what exactly you are looking for and how to connect the dots.
> 
> Let me explain it one more time:
>
> When one IRQ flood happens on one CPU:
> 
> 1) softirq handling on this CPU can't make progress
> 
> 2) kernel thread bound to this CPU can't make progress
> 
> For example, networking may need softirq to transmit packets, another
> irq thread may handle keyboards/mice or whatever, or rcu_sched may
> depend on that CPU to make progress; the irq flood then stalls the
> whole system.
> 
>>
>> AFAIU, there are fast media where the responses to requests arrive
>> faster than the time needed to process them, right?
> 
> Usually the medium is not faster than the CPU. We are talking about
> interrupts here, which can originate from lots of devices concurrently;
> for example, in Long Li's test there are 8 NVMe drives involved.
> 
>>
>> I don't see how detecting IRQ flooding and using a threaded irq is the
>> solution, can you explain?
> 
> When an IRQ flood is detected, we reserve a little time to give
> softirqs/threads a chance to be scheduled by the scheduler, so the
> above problem can be avoided.
> 
>>
>> If the responses are coming at a very high rate, whatever the solution
>> (interrupts, threaded interrupts, polling), we are still in the same
>> situation.
> 
> When we move the interrupt handling into an irq thread, other
> softirqs/threaded interrupts/threads get a chance to be scheduled, so
> we can avoid stalling the whole system.

Ok, so the real problem is tasks bound to a single CPU.

I share Thomas's opinion about a NAPI-like approach.
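
To illustrate what a NAPI-like approach means here, below is a minimal
user-space sketch in C. It is not kernel code and not the patch under
discussion: struct fake_dev, fake_irq_handler() and fake_poll() are
hypothetical names invented for the example. The idea it models is that
the hard interrupt only masks the source and defers the work, and the
deferred side handles a bounded budget per call, so other softirqs and
threads can run between calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; every name here is illustrative only. */
struct fake_dev {
	size_t pending;      /* completions waiting to be processed */
	bool irq_enabled;    /* whether the device may raise interrupts */
	bool poll_scheduled; /* whether deferred polling was requested */
};

/* Hard-IRQ side: do the minimum, mask the source, defer the work. */
static void fake_irq_handler(struct fake_dev *dev)
{
	if (!dev->irq_enabled)
		return;
	dev->irq_enabled = false;
	dev->poll_scheduled = true;
}

/*
 * Deferred side: process at most 'budget' completions per call and
 * return how many were processed. The interrupt is re-enabled only
 * once the queue is empty; otherwise polling stays scheduled.
 */
static size_t fake_poll(struct fake_dev *dev, size_t budget)
{
	size_t done = 0;

	assert(budget > 0);
	while (done < budget && dev->pending > 0) {
		dev->pending--;
		done++;
	}
	if (dev->pending == 0) {
		dev->poll_scheduled = false;
		dev->irq_enabled = true;
	}
	return done;
}
```

With 10 pending completions and a budget of 4, three poll calls are
needed, and the interrupt stays masked until the last one drains the
queue.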

I do believe you should also rely on IRQ_TIME_ACCOUNTING (maybe after
optimizing it) so that interrupt time contributes to the CPU load and
forces task migration at load balance.
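
As a rough illustration of that idea, here is a small user-space sketch
in C, under stated assumptions: CAPACITY_SCALE, capacity_after_irq() and
pick_cpu_with_most_capacity() are hypothetical names, and the accounting
window is simply passed in. It does not reproduce the scheduler's actual
code; it only shows how time spent in interrupt context could shrink the
capacity a CPU offers to tasks, so that a balancer prefers other CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define CAPACITY_SCALE 1024u /* illustrative value for "full capacity" */

/* Capacity left for tasks once time spent in IRQ context is removed. */
static uint32_t capacity_after_irq(uint64_t irq_ns, uint64_t window_ns)
{
	assert(window_ns > 0);
	if (irq_ns > window_ns)
		irq_ns = window_ns; /* clamp: IRQ time cannot exceed the window */
	return (uint32_t)(CAPACITY_SCALE * (window_ns - irq_ns) / window_ns);
}

/* Return the CPU with the most remaining capacity (lowest index on ties). */
static int pick_cpu_with_most_capacity(const uint64_t *irq_ns, int nr_cpus,
				       uint64_t window_ns)
{
	int best = 0;

	assert(nr_cpus > 0);
	for (int cpu = 1; cpu < nr_cpus; cpu++) {
		if (capacity_after_irq(irq_ns[cpu], window_ns) >
		    capacity_after_irq(irq_ns[best], window_ns))
			best = cpu;
	}
	return best;
}
```

A CPU that spends half of the window in interrupt context is left with
half of CAPACITY_SCALE, and one that is fully flooded is left with zero,
which is the case where bound tasks cannot make progress.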

>> My initial suggestion was to see whether the interrupt load is taken
>> into account in the CPU load, so that the scheduler load balance favors
>> migrating tasks to a less loaded CPU. The CPU processing interrupts
>> would then end up doing only that, while other CPUs handle the
>> "threaded" side.
>>
>> Besides that, I'm wondering if the block scheduler should somehow be
>> involved in that [1].
> 
> For NVMe or any multi-queue storage, the default scheduler is 'none',
> which basically does nothing except for submitting IO asap.
> 
> 
> Thanks,
> Ming
> 

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
