[omap_hsmmc] run into dead loop

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

    I found that there may be a dead loop in omap-hsmmc driver:
    drivers/mmc/host/omap_hsmmc.c

    My testing is on v4.19.58-0-g7a6bfa08b938, but the latest
    kernel(v5.4-rc5-19-g8005803a2ca0) has the same bug(I guess).

    In omap_hsmmc_start_command(), the IRQ is enabled too early:
        omap_hsmmc_start_command(struct omap_hsmmc_host *host, struct
mmc_command *cmd,
                struct mmc_data *data)
        {
                /* ..................... */
                omap_hsmmc_enable_irq(host, cmd);

                /* ..................... */
                /*  If IRQ occurs here, we run in deadloop! */

                host->req_in_progress = 1;

                /* ..................... */
        }

    In OMAP-HSMMC ISR, the C code is:
        static irqreturn_t omap_hsmmc_irq(int irq, void *dev_id)
        {
                struct omap_hsmmc_host *host = dev_id;
                int status;

                status = OMAP_HSMMC_READ(host->base, STAT);
                while (status & (INT_EN_MASK | CIRQ_EN)) {
                        if (host->req_in_progress)
                                omap_hsmmc_do_irq(host, status);

                        if (status & CIRQ_EN)
                                mmc_signal_sdio_irq(host->mmc);

                        /* Flush posted write */
                        status = OMAP_HSMMC_READ(host->base, STAT);
                }

                return IRQ_HANDLED;
        }

    It checks the flag host->req_in_progress, only it is set,
    omap_hsmmc_do_irq() will be called.  The STAT register will be
    written in omap_hsmmc_do_irq(), that will cause the interrupts to
    be reset.

    If MMC IRQ occurs in that danger time-window, my single core CPU
    will run into endless loop in the ISR omap_hsmmc_irq(), due to
    STAT register is never written to clean the IRQ.

    Due to the small time-window, this bug is some what difficult to
    be found. I add some debug info dumpping in that two functions,
    that will cause the code slower, the bug will occur easier.  It is
    realy a bug. The debug version of the source and the debug log is
    some what tedious. I only add the snippet here:

    The bug log:
        [   12.765560] omap_hsmmc_runtime_resume:2354: omap_hsmmc
48060000.mmc: enabled
        [   12.773006] mmc_mrq_pr_debug:279: mmc0: starting CMD6 arg
03200101 flags 0000049d
        [   12.780941] omap_hsmmc_start_command:853: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101
        [   12.790396] omap_hsmmc_start_command:859: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.800662] omap_hsmmc_start_command:862: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.811054] omap_hsmmc_start_command:871: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.821331] omap_hsmmc_start_command:878: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.831596] omap_hsmmc_start_command:888: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.841861] omap_hsmmc_start_command:919: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.852124] omap_hsmmc_start_command:923: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   12.862393] omap_hsmmc_irq:1207: omap_hsmmc 48060000.mmc:
mmc0: enter ISR, irq=34, status=1
        [   12.871155] omap_hsmmc_do_irq:1157: omap_hsmmc
48060000.mmc: IRQ Status is 1
        [   12.878553] omap_hsmmc_irq:1226: omap_hsmmc 48060000.mmc:
mmc0: leave ISR, irq=34
        [   12.886450] omap_hsmmc_start_command:939: omap_hsmmc
48060000.mmc: mmc0: CMD6, tracing
        [   12.894805] CPU: 0 PID: 95 Comm: kworker/0:1H Tainted: G
    W  O      4.19.58-0-g7a6bfa08b-cp5502 #25
        [   12.904878] Hardware name: Generic AM33XX (Flattened Device Tree)
        [   12.911292] Workqueue: kblockd blk_mq_run_work_fn
        [   12.916316] [<c010f7fc>] (unwind_backtrace) from
[<c010ce4c>] (show_stack+0x18/0x1c)
        [   12.924506] [<c010ce4c>] (show_stack) from [<c0629cac>]
(omap_hsmmc_request+0x2b0/0x4b0)
        [   12.933014] [<c0629cac>] (omap_hsmmc_request) from
[<c06147b8>] (mmc_start_request+0xa4/0xb8)
        [   12.942015] [<c06147b8>] (mmc_start_request) from
[<c0614834>] (mmc_wait_for_req+0x68/0xd0)
        [   12.950828] [<c0614834>] (mmc_wait_for_req) from
[<c0614918>] (mmc_wait_for_cmd+0x7c/0xa4)
        [   12.959552] [<c0614918>] (mmc_wait_for_cmd) from
[<c061bbe0>] (__mmc_switch+0xe0/0x2cc)
        [   12.967998] [<c061bbe0>] (__mmc_switch) from [<c061bdf8>]
(mmc_switch+0x2c/0x34)
        [   12.975805] [<c061bdf8>] (mmc_switch) from [<c061be58>]
(mmc_flush_cache+0x58/0x84)
        [   12.983844] [<c061be58>] (mmc_flush_cache) from
[<c0626f84>] (mmc_blk_mq_issue_rq+0x5d8/0x76c)
        [   12.992927] [<c0626f84>] (mmc_blk_mq_issue_rq) from
[<c0627644>] (mmc_mq_queue_rq+0x158/0x1fc)
        [   13.002056] [<c0627644>] (mmc_mq_queue_rq) from
[<c0470590>] (blk_mq_dispatch_rq_list+0x384/0x4c4)
        [   13.011521] [<c0470590>] (blk_mq_dispatch_rq_list) from
[<c047531c>] (blk_mq_sched_dispatch_requests+0xfc/0x174)
        [   13.022248] [<c047531c>] (blk_mq_sched_dispatch_requests)
from [<c046f020>] (__blk_mq_run_hw_queue+0xcc/0x110)
        [   13.032793] [<c046f020>] (__blk_mq_run_hw_queue) from
[<c01520f0>] (process_one_work+0x364/0x64c)
        [   13.042142] omap_hsmmc_irq:1207: omap_hsmmc 48060000.mmc:
mmc0: enter ISR, irq=34, status=108000
        [   13.042156] omap_hsmmc_do_irq:1157: omap_hsmmc
48060000.mmc: IRQ Status is 108000
        [   13.042173] omap_hsmmc_dbg_report_irq:1088: omap_hsmmc
48060000.mmc: MMC IRQ 0x108000 : ERRI DTO
        [   13.042196] omap_hsmmc_irq:1226: omap_hsmmc 48060000.mmc:
mmc0: leave ISR, irq=34
        [   13.076452] [<c01520f0>] (process_one_work) from
[<c0153020>] (worker_thread+0x294/0x408)
        [   13.085089] [<c0153020>] (worker_thread) from [<c0158624>]
(kthread+0x140/0x15c)
        [   13.092861] [<c0158624>] (kthread) from [<c01010b4>]
(ret_from_fork+0x14/0x20)
        [   13.100482] Exception stack(0xdd179fb0 to 0xdd179ff8)
        [   13.105828] 9fa0:
00000000 00000000 00000000 00000000
        [   13.114411] 9fc0: 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
        [   13.123032] 9fe0: 00000000 00000000 00000000 00000000
00000013 00000000
        [   13.130027] mmc_wait_for_req_done:433: mmc0: req failed
(CMD6): -110, cmd->retries=3, retrying...
        [   13.139390] omap_hsmmc_start_command:853: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101
        [   13.148839] omap_hsmmc_start_command:859: omap_hsmmc
48060000.mmc: mmc0: CMD6, argument 0x03200101, tracing
        [   13.148855] omap_hsmmc_irq:1207: omap_hsmmc 48060000.mmc:
mmc0: enter ISR, irq=34, status=108000
        [   13.168274] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108000, req_in_progress=0, loop=1
        [   13.178489] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108000, req_in_progress=0, loop=2
        [   13.188704] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108002, req_in_progress=0, loop=3
        [   13.198920] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108002, req_in_progress=0, loop=4
        [   13.209135] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108002, req_in_progress=0, loop=5
        [   13.219351] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108002, req_in_progress=0, loop=6
        endless looping ..............
        [   26.754257] omap_hsmmc 48060000.mmc: !! mmc0: ISR, irq=34,
update status: 108002, req_in_progress=0, loop=1306
        yes, endless.....

   The source code with line number(to be a reference to the log info)
 omap_hsmmc.c:
        847   │ static void
        848   │ omap_hsmmc_start_command(struct omap_hsmmc_host *host,
struct mmc_command *cmd,
        849   │     struct mmc_data *data)
        850   │ {
        851   │     int cmdreg = 0, resptype = 0, cmdtype = 0;
        852   │
        853   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument 0x%08x\n",
        854   │         mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        855   │     host->cmd = cmd;
        856   │
        857   │     omap_hsmmc_enable_irq(host, cmd);
        858   │
        859   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument
0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        860   │
        861   │     host->response_busy = 0;
        862   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument
0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        863   │
        864   │     if (cmd->flags & MMC_RSP_PRESENT) {
        865   │         if (cmd->flags & MMC_RSP_136){
        866   │             resptype = 1;
        867   │             dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        868   │         }else if (cmd->flags & MMC_RSP_BUSY) {
        869   │             resptype = 3;
        870   │             host->response_busy = 1;
        871   │             dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        872   │         } else {
        873   │             resptype = 2;
        874   │             dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        875   │         }
        876   │     }
        877   │
        878   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument
0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        879   │
        880   │     /*
        881   │      * Unlike OMAP1 controller, the cmdtype does not
seem to be based on
        882   │      * ac, bc, adtc, bcr. Only commands ending an open
ended transfer need
        883   │      * a val of 0x3, rest 0x0.
        884   │      */
        885   │     if (cmd == host->mrq->stop)
        886   │         cmdtype = 0x3;
        887   │
        888   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument
0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        889   │
        890   │     cmdreg = (cmd->opcode << 24) | (resptype << 16) |
(cmdtype << 22);
        891   │
        892   │     if ((host->flags & AUTO_CMD23) &&
mmc_op_multi(cmd->opcode) &&
        893   │         host->mrq->sbc) {
        894   │         cmdreg |= ACEN_ACMD23;
        895   │
        896   │         dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        897   │
        898   │         OMAP_HSMMC_WRITE(host->base, SDMASA,
host->mrq->sbc->arg);
        899   │
        900   │         dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        901   │     }
        902   │     if (data) {
        903   │         dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        904   │
        905   │         cmdreg |= DP_SELECT | MSBS | BCE;
        906   │         if (data->flags & MMC_DATA_READ)
        907   │             cmdreg |= DDIR;
        908   │         else
        909   │             cmdreg &= ~(DDIR);
        910   │
        911   │         dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d,
argument 0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode,
cmd->arg);
        912   │     }
        913   │
        914   │     if (host->use_dma)
        915   │         cmdreg |= DMAE;
        916   │
        917   │     host->req_in_progress = 1;
        918   │
        919   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument
0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        920   │
        921   │     OMAP_HSMMC_WRITE(host->base, ARG, cmd->arg);
        922   │
        923   │     dev_vdbg(mmc_dev(host->mmc), "%s: CMD%d, argument
0x%08x, tracing\n", mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        924   │
        925   │     OMAP_HSMMC_WRITE(host->base, CMD, cmdreg);
        926   │
        927   │     if(6 == cmd->opcode){
        928   │ #define my_dynamic_dev_dbg(dev, fmt, ...)               \
        929   │         do {                            \
        930   │             DEFINE_DYNAMIC_DEBUG_METADATA(descriptor, fmt); \
        931   │             if (DYNAMIC_DEBUG_BRANCH(descriptor)){      \
        932   │                 __dynamic_dev_dbg(&descriptor, dev, fmt, \
        933   │                           ##__VA_ARGS__);   \
        934   │                 dump_stack();               \
        935   │             }                       \
        936   │         } while (0)
        937   │
        938   │         my_dynamic_dev_dbg(mmc_dev(host->mmc), "%s:
CMD%d, tracing\n",
        939   │                    mmc_hostname(host->mmc), cmd->opcode);
        940   │ #undef my_dynamic_dev_dbg
        941   │     }
        942   │ }
       ................
       1200   │ static irqreturn_t omap_hsmmc_irq(int irq, void *dev_id)
       1201   │ {
       1202   │     struct omap_hsmmc_host *host = dev_id;
       1203   │     int status;
       1204   │     unsigned int  loop_count = 0;
       1205   │
       1206   │     status = OMAP_HSMMC_READ(host->base, STAT);
       1207   │     dev_vdbg(mmc_dev(host->mmc), "%s: enter ISR,
irq=%d, status=%x\n", mmc_hostname(host->mmc), irq, status);
       1208   │
       1209   │     while (status & (INT_EN_MASK | CIRQ_EN)) {
       1210   │         loop_count++;
       1211   │
       1212   │         if (host->req_in_progress)
       1213   │             omap_hsmmc_do_irq(host, status);
       1214   │
       1215   │         if (status & CIRQ_EN)
       1216   │             mmc_signal_sdio_irq(host->mmc);
       1217   │
       1218   │         /* Flush posted write */
       1219   │         status = OMAP_HSMMC_READ(host->base, STAT);
       1220   │         if(status){
       1221   │             dev_warn(mmc_dev(host->mmc), "!! %s: ISR,
irq=%d, update status: %x, req_in_progress=%d, loop=%u\n",
       1222   │                  mmc_hostname(host->mmc), irq, status,
host->req_in_progress, loop_count);
       1223   │         }
       1224   │     }
       1225   │
       1226   │     dev_vdbg(mmc_dev(host->mmc), "%s: leave ISR,
irq=%d\n", mmc_hostname(host->mmc), irq);
       1227   │     return IRQ_HANDLED;
       1228   │ }



    My patch to fixup that bug is here:
diff -upr linux-git/drivers/mmc/host/omap_hsmmc.c
linux-fixup/drivers/mmc/host/omap_hsmmc.c
--- linux-git/drivers/mmc/host/omap_hsmmc.c     2019-10-29
21:33:05.127638851 +0800
+++ linux-fixup/drivers/mmc/host/omap_hsmmc.c   2019-10-29
21:37:44.957637473 +0800
@@ -852,8 +852,6 @@ omap_hsmmc_start_command(struct omap_hsm
                mmc_hostname(host->mmc), cmd->opcode, cmd->arg);
        host->cmd = cmd;

-       omap_hsmmc_enable_irq(host, cmd);
-
        host->response_busy = 0;
        if (cmd->flags & MMC_RSP_PRESENT) {
                if (cmd->flags & MMC_RSP_136)
@@ -891,7 +889,8 @@ omap_hsmmc_start_command(struct omap_hsm
        if (host->use_dma)
                cmdreg |= DMAE;

-       host->req_in_progress = 1;
+       host->req_in_progress = 1; /* the flag must be set BEFORE irq enabled */
+       omap_hsmmc_enable_irq(host, cmd);

        OMAP_HSMMC_WRITE(host->base, ARG, cmd->arg);
        OMAP_HSMMC_WRITE(host->base, CMD, cmdreg);



--
Brock Zheng <yzheng@xxxxxxxxxxxxx>
郑 祎




[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux