This fix addresses a stale task completion event issued right after CQE
recovery. As it's a hardware issue, the fix is implemented as a quirk.

When an error interrupt is received, the driver runs recovery logic: it
halts the controller, clears all pending tasks, and then re-enables it.
On some platforms a stale task completion event is observed, regardless
of the CQHCI_CLEAR_ALL_TASKS bit being set. This results in either:

a) A spurious TC completion event for an empty slot.
b) Corrupted data being passed up the stack, as a result of premature
   completion of a newly added task.

To fix this, re-enable the controller, clear the task completion bits
and the interrupt status register, and halt it again. This is done at
the end of the recovery process, right before interrupts are
re-enabled.

Signed-off-by: Kornel Dulęba <korneld@xxxxxxxxxxxx>
---
 drivers/mmc/host/cqhci-core.c | 42 +++++++++++++++++++++++++++++++++++
 drivers/mmc/host/cqhci.h      |  1 +
 2 files changed, 43 insertions(+)

diff --git a/drivers/mmc/host/cqhci-core.c b/drivers/mmc/host/cqhci-core.c
index b3d7d6d8d654..e534222df90c 100644
--- a/drivers/mmc/host/cqhci-core.c
+++ b/drivers/mmc/host/cqhci-core.c
@@ -1062,6 +1062,45 @@ static void cqhci_recover_mrqs(struct cqhci_host *cq_host)
 /* CQHCI could be expected to clear it's internal state pretty quickly */
 #define CQHCI_CLEAR_TIMEOUT 20
 
+/*
+ * During CQE recovery all pending tasks are cleared from the
+ * controller and its state is reset.
+ * On some platforms the controller sets a task completion bit for
+ * a stale (previously cleared) task right after being re-enabled.
+ * This results in a spurious interrupt at best and corrupted data
+ * being passed up the stack at worst. The latter happens when
+ * the driver enqueues a new request on the problematic task slot
+ * before the "spurious" task completion interrupt is handled.
+ * To fix it:
+ * 1. Re-enable the controller by clearing the halt flag.
+ * 2. Clear the interrupt status and task completion registers.
+ * 3. Halt the controller again to be consistent with the quirkless logic.
+ *
+ * This assumes that there are no pending requests on the queue.
+ */
+static void cqhci_quirk_clear_stale_tc(struct cqhci_host *cq_host)
+{
+	u32 reg;
+
+	WARN_ON(cq_host->qcnt);
+
+	cqhci_writel(cq_host, 0, CQHCI_CTL);
+	if (cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT) {
+		pr_err("%s: cqhci: CQE failed to exit halt state\n",
+		       mmc_hostname(cq_host->mmc));
+	}
+
+	reg = cqhci_readl(cq_host, CQHCI_TCN);
+	cqhci_writel(cq_host, reg, CQHCI_TCN);
+	reg = cqhci_readl(cq_host, CQHCI_IS);
+	cqhci_writel(cq_host, reg, CQHCI_IS);
+
+	/*
+	 * Halt the controller again.
+	 * This is only needed so that we're consistent across quirk
+	 * and quirkless logic.
+	 */
+	cqhci_halt(cq_host->mmc, CQHCI_FINISH_HALT_TIMEOUT);
+}
+
 static void cqhci_recovery_finish(struct mmc_host *mmc)
 {
 	struct cqhci_host *cq_host = mmc->cqe_private;
@@ -1108,6 +1147,9 @@ static void cqhci_recovery_finish(struct mmc_host *mmc)
 	mmc->cqe_on = false;
 	spin_unlock_irqrestore(&cq_host->lock, flags);
 
+	if (cq_host->quirks & CQHCI_QUIRK_CLEAR_STALE_TC)
+		cqhci_quirk_clear_stale_tc(cq_host);
+
 	/* Ensure all writes are done before interrupts are re-enabled */
 	wmb();
 
diff --git a/drivers/mmc/host/cqhci.h b/drivers/mmc/host/cqhci.h
index 1a12e40a02e6..36131038c091 100644
--- a/drivers/mmc/host/cqhci.h
+++ b/drivers/mmc/host/cqhci.h
@@ -239,6 +239,7 @@ struct cqhci_host {
 
 	u32 quirks;
 #define CQHCI_QUIRK_SHORT_TXFR_DESC_SZ	0x1
+#define CQHCI_QUIRK_CLEAR_STALE_TC	0x2
 
 	bool enabled;
 	bool halted;
-- 
2.42.0.820.g83a721a137-goog