This is a note to let you know that I've just added the patch titled

    x86: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram

to the 6.7-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     x86-fix-cpuidle_flag_irq_enable-leaking-timer-reprog.patch
and it can be found in the queue-6.7 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


commit a2935f9d4b4d86c6e5f78b50ce5b511c9a1d8871
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date:   Wed Nov 15 10:13:23 2023 -0500

    x86: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram

    [ Upstream commit edc8fc01f608108b0b7580cb2c29dfb5135e5f0e ]

    intel_idle_irq() re-enables IRQs very early. As a result, an interrupt
    may fire before mwait() is eventually called. If such an interrupt queues
    a timer, it may go unnoticed until mwait returns and the idle loop handles
    the tick re-evaluation. And monitoring TIF_NEED_RESCHED doesn't help
    because a local timer enqueue doesn't set that flag.

    The issue is mitigated by the fact that this idle handler is only invoked
    for shallow C-states when, presumably, the next tick is supposed to be
    close enough. There may still be rare cases though when the next tick is
    far away and the selected C-state is shallow, resulting in a timer getting
    ignored for a while.

    Fix this with using sti_mwait() whose IRQ-reenablement only triggers
    upon calling mwait(), dealing with the race while keeping the interrupt
    latency within acceptable bounds.

    Fixes: c227233ad64c (intel_idle: enable interrupts before C1 on Xeons)
    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Acked-by: Rafael J. Wysocki <rafael@xxxxxxxxxx>
    Link: https://lkml.kernel.org/r/20231115151325.6262-3-frederic@xxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 778df05f8539..bae83810505b 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -115,8 +115,15 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
 		}
 
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
-		if (!need_resched())
-			__mwait(eax, ecx);
+
+		if (!need_resched()) {
+			if (ecx & 1) {
+				__mwait(eax, ecx);
+			} else {
+				__sti_mwait(eax, ecx);
+				raw_local_irq_disable();
+			}
+		}
 	}
 	current_clr_polling();
 }
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index dcda0afecfc5..3e01a6b23e75 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -131,11 +131,12 @@ static unsigned int mwait_substates __initdata;
 #define MWAIT2flg(eax) ((eax & 0xFF) << 24)
 
 static __always_inline int __intel_idle(struct cpuidle_device *dev,
-					struct cpuidle_driver *drv, int index)
+					struct cpuidle_driver *drv,
+					int index, bool irqoff)
 {
 	struct cpuidle_state *state = &drv->states[index];
 	unsigned long eax = flg2MWAIT(state->flags);
-	unsigned long ecx = 1; /* break on interrupt flag */
+	unsigned long ecx = 1*irqoff; /* break on interrupt flag */
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -159,19 +160,13 @@ static __always_inline int __intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int index)
 {
-	return __intel_idle(dev, drv, index);
+	return __intel_idle(dev, drv, index, true);
 }
 
 static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
 				    struct cpuidle_driver *drv, int index)
 {
-	int ret;
-
-	raw_local_irq_enable();
-	ret = __intel_idle(dev, drv, index);
-	raw_local_irq_disable();
-
-	return ret;
+	return __intel_idle(dev, drv, index, false);
 }
 
 static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
@@ -184,7 +179,7 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	if (smt_active)
 		__update_spec_ctrl(0);
 
-	ret = __intel_idle(dev, drv, index);
+	ret = __intel_idle(dev, drv, index, true);
 
 	if (smt_active)
 		__update_spec_ctrl(spec_ctrl);
@@ -196,7 +191,7 @@ static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
 	fpu_idle_fpregs();
-	return __intel_idle(dev, drv, index);
+	return __intel_idle(dev, drv, index, true);
 }
 
 /**
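
Illustrative note, not part of the patch: the branch that the diff adds to
mwait_idle_with_hints() can be modelled in user space to show which path each
idle handler takes. The sketch below is only a model under stated assumptions.
Every model_* function, the trace buffer and record() are hypothetical
stand-ins that merely log call order; none of them exists in the kernel, and
the monitor/polling setup is left out. Only the "ecx = 1*irqoff" derivation
and the "ecx & 1" test are taken from the diff.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical trace buffer: records which stand-in ran, in order. */
static char trace[128];

static void record(const char *step)
{
	if (trace[0] != '\0')
		strcat(trace, " ");
	strcat(trace, step);
}

/* Stand-in for need_resched(); assumed false so the idle path is taken. */
static bool model_need_resched(void)
{
	return false;
}

/* Stand-in for __mwait(): IRQs stay off, ecx bit 0 lets an IRQ break the wait. */
static void model_mwait(unsigned long eax, unsigned long ecx)
{
	(void)eax;
	(void)ecx;
	record("mwait");
}

/* Stand-in for __sti_mwait(): IRQs are re-enabled only as mwait is entered. */
static void model_sti_mwait(unsigned long eax, unsigned long ecx)
{
	(void)eax;
	(void)ecx;
	record("sti_mwait");
}

/* Stand-in for raw_local_irq_disable(). */
static void model_irq_disable(void)
{
	record("irq_disable");
}

/* Mirrors the branch added to mwait_idle_with_hints() by the diff. */
static void model_idle_with_hints(unsigned long eax, unsigned long ecx)
{
	if (!model_need_resched()) {
		if (ecx & 1) {
			model_mwait(eax, ecx);
		} else {
			model_sti_mwait(eax, ecx);
			model_irq_disable();
		}
	}
}

/* Mirrors how __intel_idle() derives ecx from the new irqoff argument. */
static const char *model_intel_idle(unsigned long eax, bool irqoff)
{
	unsigned long ecx = 1 * irqoff;

	trace[0] = '\0';
	model_idle_with_hints(eax, ecx);
	return trace;
}
```

In this model, model_intel_idle(0, true) leaves the trace "mwait", which
corresponds to intel_idle(), intel_idle_ibrs() and intel_idle_xstate().
model_intel_idle(0, false) leaves "sti_mwait irq_disable", which corresponds
to intel_idle_irq(): interrupts are no longer enabled ahead of the wait, so
the window in which a timer could be queued unnoticed is closed.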