On 14.12.2022 14:46, Peter Suti wrote:
> With the interrupt support introduced in commit 066ecde sometimes the
> Marvell-8987 wifi chip got stuck using the marvell-sd-uapsta-8987
> vendor driver. The cause seems to be that after sending ack to all interrupts
> the IRQ_SDIO still happens, but it is ignored.
>
> To work around this, recheck the IRQ_SDIO after meson_mmc_request_done().
>
> Inspired by 9e2582e ("mmc: mediatek: fix SDIO irq issue") which used a
> similar fix to handle lost interrupts.
>
The commit description of the referenced fix isn't clear with regard to
whose fault it is that an interrupt can be lost. I'd interpret it as being
a silicon bug rather than a kernel/driver bug. Not sure whether it's the
case, but it's possible that both vendors use at least parts of the same IP
in the MMC block, and therefore the issue pops up here too.

> Fixes: 066ecde ("mmc: meson-gx: add SDIO interrupt support")
>
> Signed-off-by: Peter Suti <peter.suti@xxxxxxxxxxxxxxxxxxx>
> ---
> Changes in v2:
> 	- use spin_lock instead of spin_lock_irqsave
> 	- only reenable interrupts if they were enabled already
>
> Changes in v3:
> 	- Rework the patch based on feedback from Heiner Kallweit.
> 	  The IRQ does not happen on 2 CPUs and the hard IRQ is not re-entrant.
> 	  But still one SDIO IRQ is lost without this change.
> 	  After the ack, reading the SD_EMMC_STATUS BIT(15) is set, but
> 	  meson_mmc_irq() is never called again.
>
> 	  The fix is similar to Mediatek msdc_recheck_sdio_irq().
> 	  That platform also loses an IRQ in some cases it seems.
>
>  drivers/mmc/host/meson-gx-mmc.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
> index 6e5ea0213b47..7d3ee2f9a7f6 100644
> --- a/drivers/mmc/host/meson-gx-mmc.c
> +++ b/drivers/mmc/host/meson-gx-mmc.c
> @@ -1023,6 +1023,22 @@ static irqreturn_t meson_mmc_irq(int irq, void *dev_id)
>  	if (ret == IRQ_HANDLED)
>  		meson_mmc_request_done(host->mmc, cmd->mrq);
>  
> +	/*
> +	 * Sometimes after we ack all raised interrupts,
> +	 * an IRQ_SDIO can still be pending, which can get lost.
> +	 *

A reader may scratch his head here and wonder how the interrupt can get
lost, and why a workaround is added instead of eliminating the root cause
for losing the interrupt. If you can't provide an explanation why the root
cause for losing the interrupt can't be fixed, presumably you would have to
say that you're adding a workaround for a suspected silicon bug.

> +	 * To prevent this, recheck the IRQ_SDIO here and schedule
> +	 * it to be processed.
> +	 */
> +	raw_status = readl(host->regs + SD_EMMC_STATUS);
> +	status = raw_status & (IRQ_EN_MASK | IRQ_SDIO);

This isn't needed here. Why not simply:

status = readl(host->regs + SD_EMMC_STATUS);
if (status & IRQ_SDIO)
	...

> +	if (status & IRQ_SDIO) {
> +		spin_lock(&host->lock);
> +		__meson_mmc_enable_sdio_irq(host->mmc, 0);
> +		sdio_signal_irq(host->mmc);
> +		spin_unlock(&host->lock);
> +	}
> +
>  	return ret;
>  }
>
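
To make the simplified recheck concrete, here is a rough userspace model of
the logic, not driver code. Everything prefixed mock_ is hypothetical: the
struct stands in for the host state, the status field stands in for
SD_EMMC_STATUS, and the counter stands in for the call to sdio_signal_irq().
IRQ_SDIO is assumed to be bit 15, as mentioned in the v3 notes. Locking is
left out on purpose, since the model is single-threaded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: IRQ_SDIO is bit 15 of the status register (per the v3 notes). */
#define IRQ_SDIO (1u << 15)

/* Hypothetical stand-in for the host state; not the real struct meson_host. */
struct mock_host {
	uint32_t status;        /* models the SD_EMMC_STATUS register */
	bool sdio_irq_enabled;  /* models the SDIO interrupt enable state */
	unsigned int signalled; /* counts what would be sdio_signal_irq() calls */
};

/* Models readl(host->regs + SD_EMMC_STATUS). */
static uint32_t mock_read_status(const struct mock_host *host)
{
	return host->status;
}

/*
 * Recheck for a pending SDIO interrupt after the request has completed.
 * The raw status is tested directly against IRQ_SDIO, without masking it
 * first. Returns true if a pending SDIO interrupt was found and handed off.
 */
static bool mock_recheck_sdio_irq(struct mock_host *host)
{
	uint32_t status = mock_read_status(host);

	if (!(status & IRQ_SDIO))
		return false;

	/* Models __meson_mmc_enable_sdio_irq(host->mmc, 0). */
	host->sdio_irq_enabled = false;
	/* Models sdio_signal_irq(host->mmc). */
	host->signalled++;
	return true;
}
```

Calling mock_recheck_sdio_irq() on a host whose status has bit 15 set should
disable the modelled SDIO interrupt and bump the counter once. With bit 15
clear it should leave the state untouched. Other status bits do not affect
the outcome, which is why the extra masking step looks unnecessary.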