Re: [PATCH] irq: Resolve that mask_irq/unmask_irq may not be called in pairs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



在 2023/12/15 00:11, Andy Shevchenko 写道:
On Thu, Dec 14, 2023 at 01:06:23PM +0300, Serge Semin wrote:
On Thu, Dec 14, 2023 at 09:54:06AM +0800, xiongxin wrote:
在 2023/12/13 22:59, Thomas Gleixner 写道:
On Wed, Dec 13 2023 at 10:29, xiongxin wrote:
在 2023/12/12 23:17, Thomas Gleixner 写道:
Sorry, the previous reply may not have clarified the BUG process. I
re-debugged and confirmed it yesterday. The current BUG execution
sequence is described as follows:

It's the sequence how this works and it works correctly.

Just because it does not work on your machine it does not mean that this
is incorrect and a BUG.

You are trying to fix a symptom and thereby violating guarantees of the
core code.

That is, there is a time between the 1:handle_level_irq() and
3:irq_thread_fn() calls for the 2:disable_irq() call to acquire the lock
and then implement the irq_state_set_disabled() operation. When finally
call irq_thread_fn()->irq_finalize_oneshot(), it cannot enter the
unmask_thread_irq() process.

Correct, because the interrupt has been DISABLED in the mean time.

In this case, the gpio irq_chip irq_mask()/irq_unmask() callback pairs
are not called in pairs, so I think this is a BUG, but not necessarily
fixed from the irq core code layer.

No. It is _NOT_ a BUG. unmask() is not allowed to be invoked when the
interrupt is DISABLED. That's the last time I'm going to tell you that.
Only enable_irq() can undo the effect of disable_irq(), period.

Next, when the gpio controller driver calls the suspend/resume process,
it is as follows:

suspend process:
dwapb_gpio_suspend()
       ctx->int_mask   = dwapb_read(gpio, GPIO_INTMASK);

resume process:
dwapb_gpio_resume()
       dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);

Did you actually look at the sequence I gave you?

     Suspend:

	  i2c_hid_core_suspend()
	     disable_irq();       <- Marks it disabled and eventually
				     masks it.

	  gpio_irq_suspend()
	     save_registers();    <- Saves masked interrupt

     Resume:

	  gpio_irq_resume()
	     restore_registers(); <- Restores masked interrupt

	  i2c_hid_core_resume()
	     enable_irq();        <- Unmasks interrupt and removes the
				     disabled marker


Have you verified that this order of invocations is what happens on
your machine?

Thanks,

          tglx

As described earlier, in the current situation, the irq_mask() callback of
gpio irq_chip is called in mask_irq(), followed by the disable_irq() in
i2c_hid_core_suspend(), unmask_irq() will not be executed.

Then call enable_irq() in i2c_hid_core_resume(). Since gpio irq_chip does
not implement the irq_startup() callback, it ends up calling irq_enable().

The irq_enable() function is then implemented as follows:

irq_state_clr_disabled(desc);
if (desc->irq_data.chip->irq_enable) {
	desc->irq_data.chip->irq_enable(&desc->irq_data);
	irq_state_clr_masked(desc);
} else {
	unmask_irq(desc);
}

Because gpio irq_chip implements irq_enable(), unmask_irq() is not executed,
and gpio irq_chip's irq_unmask() callback is not called. Instead,
irq_state_clr_masked() was called to clear the masked flag.

The irq masked behavior is actually controlled by the
irq_mask()/irq_unmask() callback function pairs in gpio irq_chip. When the
whole situation occurs, there is one more irq_mask() operation, or one less
irq_unmask() operation. This ends the i2c hid resume and the gpio
corresponding i2c hid interrupt is also masked.

Please help confirm whether the current situation is a BUG, or suggest other
solutions to fix it.

I has just been Cc'ed to this thread from the very start of the thread
so not sure whether I fully understand the problem. But a while ago an
IRQ disable/undisable-mask/unmask-unpairing problem was reported for
DW APB GPIO driver implementation, but in a another context though:
https://lore.kernel.org/linux-gpio/1606728979-44259-1-git-send-email-luojiaxing@xxxxxxxxxx/
We didn't come to a final conclusion back then, so the patch got lost
in the emails archive. Xiong, is it related to the problem in your
case?

 From what explained it sounds to me that GPIO driver has missing part and
this seems the missing patch (in one or another form). Perhaps we can get
a second round of review,


Yes, the current case is exactly the situation described in the link, and the specific implementation process is as described in my previous email. After adding the patch for retest, the exception can be effectively solved. And at present, can the patch be incorporated?

Thank you Thomas for your kind advice!

My previous focus has been locked until mask_irq()/unmask_irq() is not called in pairs, in fact, it can turn on the corresponding irq masked in enable_irq. Looking at the irq_enable() callback implementation of other GPIO interrupt controllers, there are indeed operations on masked registers.

Thank you all!





[Index of Archives]     [Linux SPI]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux