On 08/24/2018 01:17 AM, Greg KH wrote: > On Thu, Aug 23, 2018 at 05:57:06PM -0500, Grygorii Strashko wrote: >> Hi >> >> On 07/31/2018 05:52 PM, Frederic Weisbecker wrote: >>> Before updating the full nohz tick or the idle time on IRQ exit, we >>> check first if we are not in a nesting interrupt, whether the inner >>> interrupt is a hard or a soft IRQ. >>> >>> There is a historical reason for that: the dyntick idle mode used to >>> reprogram the tick on IRQ exit, after softirq processing, and there was >>> no point in doing that job in the outer nesting interrupt because the >>> tick update will be performed through the end of the inner interrupt >>> eventually, with even potential new timer updates. >>> >>> One corner case could show up though: if an idle tick interrupts a softirq >>> executing inline in the idle loop (through a call to local_bh_enable()) >>> after we entered in dynticks mode, the IRQ won't reprogram the tick >>> because it assumes the softirq executes on an inner IRQ-tail. As a >>> result we might put the CPU in sleep mode with the tick completely >>> stopped whereas a timer can still be enqueued. Indeed there is no tick >>> reprogramming in local_bh_enable(). We probably asssumed there was no bh >>> disabled section in idle, although there didn't seem to be debug code >>> ensuring that. >>> >>> Nowadays the nesting interrupt optimization still stands but only concern >>> full dynticks. The tick is stopped on IRQ exit in full dynticks mode >>> and we want to wait for the end of the inner IRQ to reprogramm the tick. >>> But in_interrupt() doesn't make a difference between softirqs executing >>> on IRQ tail and those executing inline. What was to be considered a >>> corner case in dynticks-idle mode now becomes a serious opportunity for >>> a bug in full dynticks mode: if a tick interrupts a task executing >>> softirq inline, the tick reprogramming will be ignored and we may exit >>> to userspace after local_bh_enable() with an enqueued timer that will >>> never fire. >>> >>> To fix this, simply keep reprogramming the tick if we are in a hardirq >>> interrupting softirq. We can still figure out a way later to restore >>> this optimization while excluding inline softirq processing. >>> >>> Reported-by: Anna-Maria Gleixner <anna-maria@xxxxxxxxxxxxx> >>> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx> >>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> >>> Cc: Ingo Molnar <mingo@xxxxxxxxxx> >>> Tested-by: Anna-Maria Gleixner <anna-maria@xxxxxxxxxxxxx> >>> --- >>> kernel/softirq.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/kernel/softirq.c b/kernel/softirq.c >>> index 900dcfe..0980a81 100644 >>> --- a/kernel/softirq.c >>> +++ b/kernel/softirq.c >>> @@ -386,7 +386,7 @@ static inline void tick_irq_exit(void) >>> >>> /* Make sure that timer wheel updates are propagated */ >>> if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu)) { >>> - if (!in_interrupt()) >>> + if (!in_irq()) >>> tick_nohz_irq_exit(); >>> } >>> #endif >>> >> >> This patch was back ported to the Stable linux-4.14.y and It causes regression - >> flood of "NOHZ: local_softirq_pending" messages on all TI boards during boot (NFS boot): >> >> [ 4.179796] NOHZ: local_softirq_pending 2c2 in sirq 256 >> [ 4.185051] NOHZ: local_softirq_pending 2c2 in sirq 256 >> >> the same is not reproducible with LKML - seems due to changes in tick-sched.c >> __tick_nohz_idle_enter()/tick_nohz_irq_exit(). > > What changes do you think fixed this? not sure. But it seems set of changes from Rafael J. Wysocki: ff7de62 nohz: Avoid duplication of code related to got_idle_tick 296bb1e cpuidle: menu: Refine idle state selection for running tick 554c8aa sched: idle: Select idle state before stopping the tick 23a8d88 time: tick-sched: Split tick_nohz_stop_sched_tick() 45f1ff5 cpuidle: Return nohz hint from cpuidle_select() 2aaf709 sched: idle: Do not stop the tick upfront in the idle loop 0e77676 time: tick-sched: Reorganize idle tick management code b7eaf1a cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely > >> I've generated backtrace from can_stop_idle_tick() (see below) and seems this >> patch makes tick_nohz_irq_exit() call unconditional in case of nested interrupt: >> >> gic_handle_irq >> |- irq_exit >> |- preempt_count_sub(HARDIRQ_OFFSET); <-- [1] >> |-__do_softirq >> <irqs enabled> >> |- gic_handle_irq() >> |- irq_exit() >> |- tick_irq_exit() >> if (!in_irq()) <-- My understanding is that this condition will be always true due to [1] >> tick_nohz_irq_exit(); >> |-__tick_nohz_idle_enter() >> |- can_stop_idle_tick() >> >> Sry, not sure if my conclusion is right and how can it be fixed. > > Any pointers to a patch that might need to be backported would be > appreciated. > commit Author: Frederic Weisbecker <frederic@xxxxxxxxxx> Date: Fri Aug 3 15:31:34 2018 +0200 nohz: Fix missing tick reprogram when interrupting an inline softirq commit 0a0e0829f990120cef165bbb804237f400953ec2 upstream. -- regards, -grygorii