Hi,
On 4/24/2020 8:07 AM, Stephen Boyd wrote:
Quoting Maulik Shah (2020-04-22 05:14:21)
We discussed this in RFC v2, pasting your reply from same.
That looks like a bug. It appears that gpiolib is only hooking the irq
disable path here so that it can keep track of what irqs are not in use
by gpiolib and allow them to be used for GPIO purposes when the irqs are
disabled. See commit 461c1a7d4733 ("gpiolib: override
irq_enable/disable"). That code in gpiolib should probably see if lazy
mode has been disabled on the irq and do similar things to what genirq
does and then let genirq mask the gpios if they trigger during the time
when the irq is disabled. Regardless of this design though, I don't
understand why this matters.
Yeah. I'm saying that the gpiolib code that forces all gpio irqs to be
non-lazy is broken. Please send a patch to fix gpiolib so that irqs can
be lazy again.
Let me reiterate.
In below genirq code, if irq_chip did not choose to implement
.irq_disable (and the mask argument is coming as false by default during
first disable) from the caller of this,
Then yes control takes lazy path since it doesn't enter to if
(desc->irq_data.chip->irq_disable) or else if (mask) case and the
__irq_disable call simply returns without executing anything.
This leaves IRQ enabled in HW during suspend even though driver done
disable_irq()
static void __irq_disable(struct irq_desc *desc, bool mask)
{
if (irqd_irq_disabled(&desc->irq_data)) {
if (mask)
mask_irq(desc);
} else {
irq_state_set_disabled(desc);
if (desc->irq_data.chip->irq_disable) {
desc->irq_data.chip->irq_disable(&desc->irq_data);
irq_state_set_masked(desc);
} else if (mask) {
mask_irq(desc);
}
}
}
With above understanding, we have two more points to consider
1. The gpiolib registers for .irq_disable for every gpio chip, making it
unlazy for every gpio chip.
2. The irq_chip that registers with gpiolib (from pinctrl-msm.c, also
has .irq_disable implemented, and its parent PDC also has .irq_disable
implemented)
Even if we fix point (1), by making gpiolib replicating genirq code,
It still will call irq_disable (from replicated code in gpiolib now)
since irq_chip that registers with gpiolib has chosen to implement
.irq_disable.
For other irq_chip's which don't implement .irq_disable, it will help.
But since from point (2) we know that PDC irq_chip implements
.irq_disable, and would disable in HW the moment .irq_disable call comes in.
I thought it didn't matter before but now that I've
learned more it sounds like we have to use lazy irqs here so that irqs
stay enabled on the path to suspend while they're disabled but marked
for wakeup.
The description for irq_disable() says...
* If the chip does not implement the irq_disable callback, we
* use a lazy disable approach. That means we mark the interrupt
* disabled, but leave the hardware unmasked. That's an
* optimization...
Since PDC irq_chip implements .irq_disable callback, IRQ gets disabled
in HW immediatly.
The comment says that lazy approch will be used if .irq_disable callback
is not implemented.
IMO, client driver's should not assume/expect that underlying irq_chip
will take lazy path only,
As it depends on whether irq_chip implemented .irq_disable callback or not.
If a driver is relying on lazy disable approach till date, then the
driver should get fixed.
(by removing the unnecessary disable_irq() during suspend)
i suspect this might be the case in EC driver, on the platforms where it
worked fine till today,
irq may not have been disabled in HW (may be that working platform's
irq_chip may not have implemented irq_disable
hence support lazy disable which leaves IRQ unmasked in HW even after
disable_irq() and able to resume from suspend)
Otherwise I don't know how to make this work.
There are 3 possible options to make it work
1. Fix the EC driver (to stop calling disable_irq() during suspend) /
support unlazy disable in EC driver
2. Support lazy disable (requires fixes in gpiolib and irq_chip also
should not implement .irq_disable callback)
With above explanation, The PDC irq_chip want to keep continue implement
.irq_disable callback
(we have different functionality in .irq_mask and .irq_disable
implementation)
so option 2 is ruled out IMO.
3. We use this patch
The current patch makes this work by re-enabling such IRQs in HW during
suspend's very end point, and restores state during resume.
i can not think of any other option.
Let me know if you now agree to go with option (1)
Otherwise, for option (3),
i can post next revision after addressing some minor fixes/suggestions.
I have tried to make this further simpler by not operating on both
PDC-GPIO and PDC irq domains,
but it seems we need to use both irq domains and take the action
accordingly during pdc_suspend() and pdc_resume().
I have missed the default case in switch (noticed same during rpmh patch
doug has posted :-)), as it can get called for CLUSTER notification.
Will fix in next revision.
Thanks,
Maulik
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation