Hi Kumar,

On Tuesday, March 30, 2021 12:46 PM, Viresh Kumar wrote:
> On 30-03-21, 11:15, Ran Wang wrote:
> > When selecting PREEMPT_RT, cpufreq_driver->stop_cpu(policy) might get
> > stuck due to irq_work_sync() pending for work on lazy_list, which had
> > no chance to be served in softirq context sometimes.
> >
> > The reason of lazy_list was not served is because the nearest
> > activated timer might have been set to expire after long time (such as 100+ seconds).
> > Then function run_local_timers() would not call
> > raise_softirq(TIMER_SOFTIRQ) to handle enqueued irq_work.
> >
> > This is observed on LX2160ARDB and LS1088ARDB with cpufreq governor of
> > ‘schedutil’ or ‘ondemand’.
> >
> > Configure related irqwork to run on raw-irq context could fix this issue.
> >
> > Signed-off-by: Ran Wang <ran.wang_1@xxxxxxx>
> > ---
> > Change in v2:
> >  - Update commit message to explain root cause more clear.
> >
> >  drivers/cpufreq/cpufreq_governor.c | 2 +-
> >  kernel/sched/cpufreq_schedutil.c   | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index 63f7c219062b..731a7b1434df 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -360,7 +360,7 @@ static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *poli
> >  	policy_dbs->policy = policy;
> >  	mutex_init(&policy_dbs->update_mutex);
> >  	atomic_set(&policy_dbs->work_count, 0);
> > -	init_irq_work(&policy_dbs->irq_work, dbs_irq_work);
> > +	policy_dbs->irq_work = IRQ_WORK_INIT_HARD(dbs_irq_work);
> >  	INIT_WORK(&policy_dbs->work, dbs_work_handler);
> >
> >  	/* Set policy_dbs for all CPUs, online+offline */
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 50cbad89f7fa..1d5af87ec92e 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -611,7 +611,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
> >
> >  	sg_policy->thread = thread;
> >  	kthread_bind_mask(thread, policy->related_cpus);
> > -	init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> > +	sg_policy->irq_work = IRQ_WORK_INIT_HARD(sugov_irq_work);
> >  	mutex_init(&sg_policy->work_lock);
> >
> >  	wake_up_process(thread);
>
> Will this have any impact on the non-preempt-rt case ? Otherwise,

My understanding is that in the non-PREEMPT_RT case the work is queued to
raised_list instead, and arch_irq_work_raise() is called immediately to raise
an IPI to serve it. So the behaviour there is already similar to what this
patch does for the PREEMPT_RT case; see __irq_work_queue_local() in
kernel/irq_work.c:

	/* Enqueue on current CPU, work must already be claimed and preempt disabled */
	static void __irq_work_queue_local(struct irq_work *work)
	{
		struct llist_head *list;
		bool lazy_work;
		int work_flags;

		work_flags = atomic_read(&work->node.a_flags);
		if (work_flags & IRQ_WORK_LAZY)
			lazy_work = true;
		else if (IS_ENABLED(CONFIG_PREEMPT_RT) &&
			 !(work_flags & IRQ_WORK_HARD_IRQ))
			lazy_work = true;
		else
			lazy_work = false;
		...

I have also tested on the mainline and RT trees with CONFIG_PREEMPT selected
and could not reproduce this issue.

Regards,
Ran

> Acked-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>
> --
> viresh
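
For reference, the only difference between the two initializers used in the
patch is the initial flag bits: IRQ_WORK_INIT_HARD() marks the work with
IRQ_WORK_HARD_IRQ, which is exactly the bit the PREEMPT_RT branch quoted above
tests before deciding between lazy_list and raised_list. A rough sketch of the
relevant helpers from include/linux/irq_work.h (paraphrased from memory for a
v5.11-era tree; the exact form may differ between kernel versions):

	/* Common initializer: sets the handler and the initial flag bits. */
	#define __IRQ_WORK_INIT(_func, _flags) (struct irq_work){	\
		.node = { .a_flags = ATOMIC_INIT(_flags), },		\
		.func = (_func),					\
	}

	#define IRQ_WORK_INIT(_func)      __IRQ_WORK_INIT(_func, 0)
	#define IRQ_WORK_INIT_HARD(_func) __IRQ_WORK_INIT(_func, IRQ_WORK_HARD_IRQ)

	/*
	 * init_irq_work() leaves the flags at 0, so on PREEMPT_RT the work
	 * ends up on lazy_list; IRQ_WORK_INIT_HARD() sets IRQ_WORK_HARD_IRQ,
	 * so __irq_work_queue_local() queues it to raised_list and raises an
	 * IPI right away, as in the non-PREEMPT_RT case.
	 */
	static inline void
	init_irq_work(struct irq_work *work, void (*func)(struct irq_work *))
	{
		*work = IRQ_WORK_INIT(func);
	}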