Hi Thomas, Xiong On Thu, Dec 14, 2023 at 09:54:06AM +0800, xiongxin wrote: > 在 2023/12/13 22:59, Thomas Gleixner 写道: > > On Wed, Dec 13 2023 at 10:29, xiongxin wrote: > > > 在 2023/12/12 23:17, Thomas Gleixner 写道: > > > Sorry, the previous reply may not have clarified the BUG process. I > > > re-debugged and confirmed it yesterday. The current BUG execution > > > sequence is described as follows: > > > > It's the sequence how this works and it works correctly. > > > > Just because it does not work on your machine it does not mean that this > > is incorrect and a BUG. > > > > You are trying to fix a symptom and thereby violating guarantees of the > > core code. > > > > > That is, there is a time between the 1:handle_level_irq() and > > > 3:irq_thread_fn() calls for the 2:disable_irq() call to acquire the lock > > > and then implement the irq_state_set_disabled() operation. When finally > > > call irq_thread_fn()->irq_finalize_oneshot(), it cannot enter the > > > unmask_thread_irq() process. > > > > Correct, because the interrupt has been DISABLED in the mean time. > > > > > In this case, the gpio irq_chip irq_mask()/irq_unmask() callback pairs > > > are not called in pairs, so I think this is a BUG, but not necessarily > > > fixed from the irq core code layer. > > > > No. It is _NOT_ a BUG. unmask() is not allowed to be invoked when the > > interrupt is DISABLED. That's the last time I'm going to tell you that. > > Only enable_irq() can undo the effect of disable_irq(), period. > > > > > Next, when the gpio controller driver calls the suspend/resume process, > > > it is as follows: > > > > > > suspend process: > > > dwapb_gpio_suspend() > > > ctx->int_mask = dwapb_read(gpio, GPIO_INTMASK); > > > > > > resume process: > > > dwapb_gpio_resume() > > > dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask); > > > > Did you actually look at the sequence I gave you? > > > > Suspend: > > > > i2c_hid_core_suspend() > > disable_irq(); <- Marks it disabled and eventually > > masks it. > > > > gpio_irq_suspend() > > save_registers(); <- Saves masked interrupt > > > > Resume: > > > > gpio_irq_resume() > > restore_registers(); <- Restores masked interrupt > > > > i2c_hid_core_resume() > > enable_irq(); <- Unmasks interrupt and removes the > > disabled marker > > > > > > Have you verified that this order of invocations is what happens on > > your machine? > > > > Thanks, > > > > tglx > > As described earlier, in the current situation, the irq_mask() callback of > gpio irq_chip is called in mask_irq(), followed by the disable_irq() in > i2c_hid_core_suspend(), unmask_irq() will not be executed. > > Then call enable_irq() in i2c_hid_core_resume(). Since gpio irq_chip does > not implement the irq_startup() callback, it ends up calling irq_enable(). > > The irq_enable() function is then implemented as follows: > > irq_state_clr_disabled(desc); > if (desc->irq_data.chip->irq_enable) { > desc->irq_data.chip->irq_enable(&desc->irq_data); > irq_state_clr_masked(desc); > } else { > unmask_irq(desc); > } > > Because gpio irq_chip implements irq_enable(), unmask_irq() is not executed, > and gpio irq_chip's irq_unmask() callback is not called. Instead, > irq_state_clr_masked() was called to clear the masked flag. > > The irq masked behavior is actually controlled by the > irq_mask()/irq_unmask() callback function pairs in gpio irq_chip. When the > whole situation occurs, there is one more irq_mask() operation, or one less > irq_unmask() operation. This ends the i2c hid resume and the gpio > corresponding i2c hid interrupt is also masked. > > Please help confirm whether the current situation is a BUG, or suggest other > solutions to fix it. > I has just been Cc'ed to this thread from the very start of the thread so not sure whether I fully understand the problem. But a while ago an IRQ disable/undisable-mask/unmask-unpairing problem was reported for DW APB GPIO driver implementation, but in a another context though: https://lore.kernel.org/linux-gpio/1606728979-44259-1-git-send-email-luojiaxing@xxxxxxxxxx/ We didn't come to a final conclusion back then, so the patch got lost in the emails archive. Xiong, is it related to the problem in your case? -Serge(y)