On Tue, 3 Sep 2019, Ming Lei wrote:
> Scheduler can do nothing if the CPU is taken completely by handling
> interrupt & softirq, so seems not a scheduler problem, IMO.

Well, but thinking more about it, the solution you are proposing is more a
bandaid than anything else.

Look at the networking NAPI mechanism. It handles that situation
gracefully by:

  - Disabling the interrupt at the device level

  - Polling the device in softirq context until empty and then reenabling
    interrupts

  - In case the softirq handles more packets than a defined budget it
    forces the softirq into the ksoftirqd thread context, which also
    allows rescheduling once the budget is completed.

With your adhoc workaround you handle one specific case. But it does not
work at all when an overload situation occurs in a case where the queues
are truly per CPU, simply because then the interrupt and the thread
affinity are the same and target a single CPU, and you replace the
interrupt with a threaded handler which runs by default with RT priority.

So instead of hacking something half baked into the hard/softirq code, why
can't block do a budget limitation and, once that is reached, switch to
something NAPI-like as a general solution?

Thanks,

	tglx
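
For illustration only, the three NAPI-style steps listed above can be
modelled in plain userspace C. This is a rough sketch, not kernel code and
not a proposed patch: struct sim_dev, sim_irq(), sim_softirq() and
sim_thread() are hypothetical names invented here, the "device" is just a
counter of pending completions, and the reschedule point is only counted,
not performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model: pending completions plus an IRQ enable bit. */
struct sim_dev {
	unsigned int pending;	/* completions waiting in the device queue */
	bool irq_enabled;	/* device-level interrupt enable */
	bool deferred;		/* set once work was pushed to thread context */
};

enum poll_result { POLL_DONE, POLL_DEFERRED };

/* Step 1: the hard interrupt only masks the device interrupt. */
static void sim_irq(struct sim_dev *dev)
{
	assert(dev != NULL);
	dev->irq_enabled = false;
}

/* Handle at most 'budget' completions; return how many were handled. */
static unsigned int sim_poll(struct sim_dev *dev, unsigned int budget)
{
	unsigned int done = 0;

	while (done < budget && dev->pending > 0) {
		dev->pending--;
		done++;
	}
	return done;
}

/*
 * Step 2: poll in (simulated) softirq context. If the queue drains within
 * the budget, reenable the interrupt. Otherwise mark the work as deferred
 * so that a thread has to pick it up (step 3).
 */
static enum poll_result sim_softirq(struct sim_dev *dev, unsigned int budget)
{
	assert(dev != NULL);
	sim_poll(dev, budget);
	if (dev->pending == 0) {
		dev->irq_enabled = true;
		return POLL_DONE;
	}
	dev->deferred = true;
	return POLL_DEFERRED;
}

/*
 * Step 3: drain the rest in (simulated) thread context, one budget-sized
 * chunk at a time. Returns the number of points at which a real thread
 * could have been rescheduled.
 */
static unsigned int sim_thread(struct sim_dev *dev, unsigned int budget)
{
	unsigned int resched_points = 0;

	assert(dev != NULL);
	if (budget == 0)
		budget = 1;	/* avoid spinning forever on a zero budget */

	while (dev->pending > 0) {
		sim_poll(dev, budget);
		resched_points++;	/* a real thread could yield here */
	}
	dev->deferred = false;
	dev->irq_enabled = true;
	return resched_points;
}
```

The point of the model is that the interrupt stays masked for the whole
overload period, and the work beyond the budget runs in a context the
scheduler can preempt, instead of monopolizing the CPU in hard/softirq
context.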