On 21/03/2023 11:04, Sebastian Andrzej Siewior wrote: > On 2023-03-15 17:49:10 [+0100], Krzysztof Kozlowski wrote: >> Qualcomm cpufreq driver configures interrupts with affinity to each >> cluster, e.g. dcvsh-irq-0, dcvsh-irq-4 and dcvsh-irq-7 on SM8250. >> Triggered interrupt will schedule delayed work, but, since workqueue >> prefers local CPUs, it might get executed on a CPU dedicated to realtime >> tasks causing unexpected latencies in realtime workload. >> >> Use unbound workqueue for such case. This might come with performance >> or energy penalty, e.g. because of cache miss or when other CPU is >> sleeping. > > I miss the point where it explains that only PREEMPT_RT is affected by > this. I assume "realtime tasks" implies this, but I can make it clearer. > >> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx> >> --- >> drivers/cpufreq/qcom-cpufreq-hw.c | 11 ++++++++++- >> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c >> index 2f581d2d617d..c5ff8d25fabb 100644 >> --- a/drivers/cpufreq/qcom-cpufreq-hw.c >> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c >> @@ -390,7 +390,16 @@ static irqreturn_t qcom_lmh_dcvs_handle_irq(int irq, void *data) >> >> /* Disable interrupt and enable polling */ >> disable_irq_nosync(c_data->throttle_irq); >> - schedule_delayed_work(&c_data->throttle_work, 0); >> + >> + /* >> + * Workqueue prefers local CPUs and since interrupts have set affinity, >> + * the work might execute on a CPU dedicated to realtime tasks. >> + */ >> + if (IS_ENABLED(CONFIG_PREEMPT_RT)) >> + queue_delayed_work_on(WORK_CPU_UNBOUND, system_unbound_wq, >> + &c_data->throttle_work, 0); >> + else >> + schedule_delayed_work(&c_data->throttle_work, 0); > > You isolated CPUs and use this on PREEMPT_RT. And this special use-case > is your reasoning to make this change and let it depend on PREEMPT_RT? > > If you do PREEMPT_RT and you care about latency I would argue that you > either disable cpufreq and set it to PERFORMANCE so that the highest > available frequency is set once and not changed afterwards. The cpufreq is set to performance. It will be changed anyway because underlying FW notifies through such interrupts about thermal mitigation happening. The only other solution is to disable the cpufreq device, e.g. by not compiling it. Best regards, Krzysztof