On 2023-03-21 11:24:46 [+0100], Krzysztof Kozlowski wrote:
> >> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> >> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> >> @@ -390,7 +390,16 @@ static irqreturn_t qcom_lmh_dcvs_handle_irq(int irq, void *data)
> >>
> >>  	/* Disable interrupt and enable polling */
> >>  	disable_irq_nosync(c_data->throttle_irq);
> >> -	schedule_delayed_work(&c_data->throttle_work, 0);
> >> +
> >> +	/*
> >> +	 * Workqueue prefers local CPUs and since interrupts have set affinity,
> >> +	 * the work might execute on a CPU dedicated to realtime tasks.
> >> +	 */
> >> +	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> >> +		queue_delayed_work_on(WORK_CPU_UNBOUND, system_unbound_wq,
> >> +				      &c_data->throttle_work, 0);
> >> +	else
> >> +		schedule_delayed_work(&c_data->throttle_work, 0);
> >
> > You isolated CPUs and use this on PREEMPT_RT. And this special use-case
> > is your reasoning to make this change and let it depend on PREEMPT_RT?
> >
> > If you do PREEMPT_RT and you care about latency I would argue that you
> > either disable cpufreq and set it to PERFORMANCE so that the highest
> > available frequency is set once and not changed afterwards.
>
> The cpufreq is set to performance. It will be changed anyway because
> underlying FW notifies through such interrupts about thermal mitigation
> happening.

I still fail to understand why this is PREEMPT_RT specific and not a
problem in general when it comes to NO_HZ_FULL and/or CPU isolation.
However, the thermal notifications have nothing to do with cpufreq.

> The only other solution is to disable the cpufreq device, e.g. by not
> compiling it.

People often disable cpufreq because _usually_ the system boots at
maximum performance. There are, however, exceptions, and even x86 systems
are sometimes configured to a lower clock speed by the firmware/BIOS.
In this case it is nice to have cpufreq so that it is possible to set the
system to a higher clock speed during boot. cpufreq then remains idle
unless the governor is changed.

> Best regards,
> Krzysztof

Sebastian