On Mon, Apr 10, 2023 at 11:53 AM Wyes Karny <wyes.karny@xxxxxxx> wrote: > > The set value of `fast_switch_enabled` flag doesn't guarantee that > fast_switch callback is set. For some drivers such as amd_pstate, the > adjust_perf callback is used but it still sets `fast_switch_possible` > flag. This is not wrong because this flag doesn't imply fast_switch > callback is set, it implies whether the driver can guarantee that > frequency can be changed on any CPU sharing the policy and that the > change will affect all of the policy CPUs without the need to send any > IPIs or issue callbacks from the notifier chain. Therefore add an extra > NULL check before calling fast_switch in sugov_update_single_freq > function. > > Ideally `sugov_update_single_freq` function should not be called with > amd_pstate. But in a corner case scenario, when aperf/mperf overflow > occurs, kernel disables frequency invariance calculation which causes > schedutil to fallback to sugov_update_single_freq which currently relies > on the fast_switch callback. Yes, it does. Which is why that callback must be provided if the driver sets fast_switch_enabled. Overall, adjust_perf is optional, but fast_switch_enabled can only be set if fast_switch is actually present. Please fix the driver. > > Normal flow: > sugov_update_single_perf > cpufreq_driver_adjust_perf > cpufreq_driver->adjust_perf > > Error case flow: > sugov_update_single_perf > sugov_update_single_freq <-- This is chosen because the freq invariant is disabled due to aperf/mperf overflow > cpufreq_driver_fast_switch > cpufreq_driver->fast_switch <-- Here NULL pointer dereference is happening, because fast_switch is not set > > Fix this NULL pointer dereference issue by doing a NULL check. > > Fixes: a61dec744745 ("cpufreq: schedutil: Avoid missing updates for one-CPU policies") > Signed-off-by: Wyes Karny <wyes.karny@xxxxxxx> > > Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > drivers/cpufreq/cpufreq.c | 11 +++++++++++ > include/linux/cpufreq.h | 1 + > kernel/sched/cpufreq_schedutil.c | 2 +- > 3 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > index 6d8fd3b8dcb5..364d31b55380 100644 > --- a/drivers/cpufreq/cpufreq.c > +++ b/drivers/cpufreq/cpufreq.c > @@ -2138,6 +2138,17 @@ unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy, > } > EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch); > > +/** > + * cpufreq_driver_has_fast_switch - Check "fast switch" callback. > + * > + * Return 'true' if the ->fast_switch callback is present for the > + * current driver or 'false' otherwise. > + */ > +bool cpufreq_driver_has_fast_switch(void) > +{ > + return !!cpufreq_driver->fast_switch; > +} > + > /** > * cpufreq_driver_adjust_perf - Adjust CPU performance level in one go. > * @cpu: Target CPU. > diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h > index 65623233ab2f..8a9286fc718b 100644 > --- a/include/linux/cpufreq.h > +++ b/include/linux/cpufreq.h > @@ -604,6 +604,7 @@ struct cpufreq_governor { > /* Pass a target to the cpufreq driver */ > unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy, > unsigned int target_freq); > +bool cpufreq_driver_has_fast_switch(void); > void cpufreq_driver_adjust_perf(unsigned int cpu, > unsigned long min_perf, > unsigned long target_perf, > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c > index e3211455b203..a1c449525ac2 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -364,7 +364,7 @@ static void sugov_update_single_freq(struct update_util_data *hook, u64 time, > * concurrently on two different CPUs for the same target and it is not > * necessary to acquire the lock in the fast switch case. > */ > - if (sg_policy->policy->fast_switch_enabled) { > + if (sg_policy->policy->fast_switch_enabled && cpufreq_driver_has_fast_switch()) { > cpufreq_driver_fast_switch(sg_policy->policy, next_f); > } else { > raw_spin_lock(&sg_policy->update_lock); > -- > 2.34.1 >