On 7 February 2017 at 12:12, Jean-Nicolas Graux <jean-nicolas.graux@xxxxxx> wrote: > This fixes a race condition that may occur whenever ST micro busy end > interrupt is raised just after being unmasked but before leaving mmci > interrupt context. > > More in details: > > A dead-lock has been found if connecting mmci ST Micro variant whose > amba id is 0x10480180 to some new eMMC that supports internal caches. > Whenever mmci driver enables cache control by programming eMMC's EXT_CSD > register, block driver may request to flush the eMMC internal caches > causing mmci driver to send a MMC_SWITCH command to the card with > FLUSH_CACHE operation. And because busy end interrupt may be mistakenly > cleared while not yet processed, this mmc request may never complete. > As a result, mmcqd task may be stuck forever. > > Here is an instance caught by lockup detector which shows that mmcqd task > was hung while waiting for mmc_flush_cache command to complete: > >>>>> > .. > .. > [ 240.251595] INFO: task mmcqd/1:52 blocked for more than 120 seconds. > [ 240.257973] Not tainted 4.1.13-00510-g9d91424 #2 > [ 240.263109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 240.270955] mmcqd/1 D c047504c 0 52 2 0x00000000 > [ 240.277359] [<c047504c>] (__schedule) from [<c04754a0>] (schedule+0x40/0x98) > [ 240.284418] [<c04754a0>] (schedule) from [<c0477d40>] (schedule_timeout+0x148/0x188) > [ 240.292191] [<c0477d40>] (schedule_timeout) from [<c0476040>] (wait_for_common+0xa4/0x170) > [ 240.300491] [<c0476040>] (wait_for_common) from [<c02efc1c>] (mmc_wait_for_req_done+0x4c/0x13c) > [ 240.309224] [<c02efc1c>] (mmc_wait_for_req_done) from [<c02efd90>] (mmc_wait_for_cmd+0x64/0x84) > [ 240.317953] [<c02efd90>] (mmc_wait_for_cmd) from [<c02f5b14>] (__mmc_switch+0xa4/0x2a8) > [ 240.325964] [<c02f5b14>] (__mmc_switch) from [<c02f5d40>] (mmc_switch+0x28/0x30) > [ 240.333389] [<c02f5d40>] (mmc_switch) from [<c02f0984>] (mmc_flush_cache+0x54/0x80) > [ 240.341073] [<c02f0984>] (mmc_flush_cache) from [<c02ff0c4>] (mmc_blk_issue_rq+0x114/0x4e8) > [ 240.349459] [<c02ff0c4>] (mmc_blk_issue_rq) from [<c03008d4>] (mmc_queue_thread+0xc0/0x180) > [ 240.357844] [<c03008d4>] (mmc_queue_thread) from [<c003cf90>] (kthread+0xdc/0xf4) > [ 240.365339] [<c003cf90>] (kthread) from [<c0010068>] (ret_from_fork+0x14/0x2c) > .. > .. > [ 240.664311] INFO: task partprobe:564 blocked for more than 120 seconds. > [ 240.670943] Not tainted 4.1.13-00510-g9d91424 #2 > [ 240.676078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 240.683922] partprobe D c047504c 0 564 486 0x00000000 > [ 240.690318] [<c047504c>] (__schedule) from [<c04754a0>] (schedule+0x40/0x98) > [ 240.697396] [<c04754a0>] (schedule) from [<c0477d40>] (schedule_timeout+0x148/0x188) > [ 240.705149] [<c0477d40>] (schedule_timeout) from [<c0476040>] (wait_for_common+0xa4/0x170) > [ 240.713446] [<c0476040>] (wait_for_common) from [<c01f3300>] (submit_bio_wait+0x58/0x64) > [ 240.721571] [<c01f3300>] (submit_bio_wait) from [<c01fbbd8>] (blkdev_issue_flush+0x60/0x88) > [ 240.729957] [<c01fbbd8>] (blkdev_issue_flush) from [<c010ff84>] (blkdev_fsync+0x34/0x44) > [ 240.738083] [<c010ff84>] (blkdev_fsync) from [<c0109594>] (do_fsync+0x3c/0x64) > [ 240.745319] [<c0109594>] (do_fsync) from [<c000ffc0>] (ret_fast_syscall+0x0/0x3c) > .. > .. > <<<< > > Here is the detailed sequence showing when this issue may happen: > > 1) At probe time, mmci device is initialized and card busy detection > based on DAT[0] monitoring is enabled. > > 2) Later during run time, since card reported to support internal > caches, a MMCI_SWITCH command is sent to eMMC device with FLUSH_CACHE > operation. On receiving this command, eMMC may enter busy state > (for a relatively short time in the case of the dead-lock). > > 3) Then mmci interrupt is raised and mmci_irq() is called: > > MMCISTATUS register is read and is equal to 0x01000440. So the following > status bits are set: > - MCI_CMDRESPEND (= 6) > - MCI_DATABLOCKEND (= 10) > - MCI_ST_CARDBUSY (= 24) > > Since MMCIMASK0 register is 0x3FF, status variable is set to 0x00000040 > and BIT MCI_CMDRESPEND is cleared by writing MMCICLEAR register. > > Then mmci_cmd_irq() is called. > Considering the following conditions: > - host->busy_status is 0, > - this is a "busy response", > - reading again MMCISTATUS register gives 0x1000400, > MMCIMASK0 is updated to unmask MCI_ST_BUSYEND bit. > > Thus, MMCIMASK0 is set to 0x010003FF and host->busy_status is set to wait > for busy end completion. > > Back again in status loop of mmci_irq(), we quickly go through > mmci_data_irq() as there are no data in that case. > And we finally go through following test at the end of while(status) loop: > > /* > * Don't poll for busy completion in irq context. > */ > if (host->variant->busy_detect && host->busy_status) > status &= ~host->variant->busy_detect_flag; > > Because status variable is not yet null (is equal to 0x40), > we do not leave interrupt context yet but we loop again into while(status) > loop. So we run across following steps: > > a) MMCISTATUS register is read again and this time is equal to 0x01000400. > > So that following bits are set: > - MCI_DATABLOCKEND (= 10) > - MCI_ST_CARDBUSY (= 24) > > Since MMCIMASK0 register is equal to 0x010003FF: > > b) status variable is set to 0x01000000. > c) MCI_ST_CARDBUSY bit is cleared by writing MMCICLEAR register. > > Then, mmci_cmd_irq() is called one more time. Since host->busy_status is > set and that MCI_ST_CARDBUSY is set in status variable, we just return > from this function. > > Back again in mmci_irq(), status variable is set to 0 and we finally > leave the while(status) loop. As a result we leave interrupt context, > waiting for busy end interrupt event. > > Now, consider that busy end completion is raised IN BETWEEN steps 3.a) > and 3.c). In such a case, we may mistakenly clear busy end interrupt at > step 3.c) while it has not yet been processed. This will result in mmc > command to wait forever for a busy end completion that will never happen. > > This patch implements the following changes: > > Considering that the mmci seems to be triggering the IRQ on both edges while > monitoring DAT0 for busy completion and that same status bit is used to > monitor start and end of busy detection, special care must be taken > to make sure that both start and end interrupts are always cleared one > after the other. Thus: > > 1) Clearing of card busy bit is moved in mmc_cmd_irq() function where > unmasking of busy end bit is effectively handled. > 2) Just before unmasking busy end event, busy start event is cleared > by writing card busy bit in MMCICLEAR register. > 3) Finally, once we are no more busy with a command, busy end event is > cleared writing again card busy bit in MMCICLEAR register. > > This patch has been tested with the ST Accordo5 machine that is not > upstreamed yet but that relies on mmci upstreamed driver. > > Signed-off-by: Sarang Mairal <sarang.mairal@xxxxxxxxxx> > Signed-off-by: Jean-Nicolas Graux <jean-nicolas.graux@xxxxxx> I tested this on my ux500 board, with some local hacks to make sure all three different options for busy detections works (->card_busy(), CMD13, MMC_CAP_WAIT_WHILE_BUSY). Thanks, applied for fixes! Kind regards Uffe > --- > drivers/mmc/host/mmci.c | 32 +++++++++++++++++++++++++------- > 1 file changed, 25 insertions(+), 7 deletions(-) > > diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c > index 01a8047..b597244 100644 > --- a/drivers/mmc/host/mmci.c > +++ b/drivers/mmc/host/mmci.c > @@ -1023,7 +1023,12 @@ static void mmci_start_data(struct mmci_host *host, struct mmc_data *data) > if (!host->busy_status && busy_resp && > !(status & (MCI_CMDCRCFAIL|MCI_CMDTIMEOUT)) && > (readl(base + MMCISTATUS) & host->variant->busy_detect_flag)) { > - /* Unmask the busy IRQ */ > + > + /* Clear the busy start IRQ */ > + writel(host->variant->busy_detect_mask, > + host->base + MMCICLEAR); > + > + /* Unmask the busy end IRQ */ > writel(readl(base + MMCIMASK0) | > host->variant->busy_detect_mask, > base + MMCIMASK0); > @@ -1038,10 +1043,14 @@ static void mmci_start_data(struct mmci_host *host, struct mmc_data *data) > > /* > * At this point we are not busy with a command, we have > - * not received a new busy request, mask the busy IRQ and > - * fall through to process the IRQ. > + * not received a new busy request, clear and mask the busy > + * end IRQ and fall through to process the IRQ. > */ > if (host->busy_status) { > + > + writel(host->variant->busy_detect_mask, > + host->base + MMCICLEAR); > + > writel(readl(base + MMCIMASK0) & > ~host->variant->busy_detect_mask, > base + MMCIMASK0); > @@ -1283,12 +1292,21 @@ static irqreturn_t mmci_irq(int irq, void *dev_id) > } > > /* > - * We intentionally clear the MCI_ST_CARDBUSY IRQ here (if it's > - * enabled) since the HW seems to be triggering the IRQ on both > - * edges while monitoring DAT0 for busy completion. > + * We intentionally clear the MCI_ST_CARDBUSY IRQ (if it's > + * enabled) in mmci_cmd_irq() function where ST Micro busy > + * detection variant is handled. Considering the HW seems to be > + * triggering the IRQ on both edges while monitoring DAT0 for > + * busy completion and that same status bit is used to monitor > + * start and end of busy detection, special care must be taken > + * to make sure that both start and end interrupts are always > + * cleared one after the other. > */ > status &= readl(host->base + MMCIMASK0); > - writel(status, host->base + MMCICLEAR); > + if (host->variant->busy_detect) > + writel(status & ~host->variant->busy_detect_mask, > + host->base + MMCICLEAR); > + else > + writel(status, host->base + MMCICLEAR); > > dev_dbg(mmc_dev(host->mmc), "irq0 (data+cmd) %08x\n", status); > > -- > 1.9.1 > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html