On Tue, 13 Jun 2023 at 22:34, Linus Walleij <linus.walleij@xxxxxxxxxx> wrote: > > This series fixes a pretty serious problem in the MMCI > busy detect handling, discovered only after going up and > down a ladder of refactorings. > > The code is written expecting the Ux500 busy detect > to fire two interrupts: one at the start of the busy > signalling and one at the end of the busy signalling. > > The root cause of the problem seen on some devices > is that only the first IRQ arrives, and then the device > hangs, waiting perpetually for the next IRQ to arrive. > > This is eventually solved by adding a timeout using > a delayed work that fire after a timeout if the busy detect > has not stopped. This is the last patch in the series. > > I included the rewrite of the entire busy detect logic > to use a state machine as this makes it way easier to > debug and will print messages about other error > conditions as well. > > The problem affects especially the Skomer > (Samsung GT-I9070) and Kyle (Samsung SGH-I407). > > It is fine to just apply this for the -next kernel, > despite it fixes the busy detect that has been broken > for these devices for a while, we can think about > backporting a simpler version of the timeout for > stable kernels if we want. > > Signed-off-by: Linus Walleij <linus.walleij@xxxxxxxxxx> > --- > Changes in v4: > - Fix an unrelated change in patch 1 > - Move MMCI_BUSY_DONE initialization outside the if()-clause > for busy detection. > - Use the per-command ->busy_timeout as calculated by the > core. > - Link to v3: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v3-0-cd3d5925ae64@xxxxxxxxxx > > Changes in v3: > - Unconditionally assign busy_status = 0 > - Rewrite state machine states to just three > - Drop a patch that gets absorbed into another patch > - Drop patch to get busy state from the state machine, it was > fishy, based on a misunderstanding and not needed > - Link to v2: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v2-0-eeb10323b546@xxxxxxxxxx > > Changes in v2: > - Drop pointless patch nr 1 > - Unconditionally intialize some state variables > - Use a less fragile method to look for busy status when > using busy detect, should fix Yann's problem > - Link to v1: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v1-0-28ac19a74e5e@xxxxxxxxxx > > --- > Linus Walleij (10): > mmc: mmci: Clear busy_status when starting command > mmc: mmci: Unwind big if() clause > mmc: mmci: Stash status while waiting for busy > mmc: mmci: Break out error check in busy detect > mmc: mmci: Make busy complete state machine explicit > mmc: mmci: Retry the busy start condition > mmc: mmci: Use state machine state as exit condition > mmc: mmci: Use a switch statement machine > mmc: mmci: Break out a helper function > mmc: mmci: Add busydetect timeout > > drivers/mmc/host/mmci.c | 143 ++++++++++++++++++++++++++++-------- > drivers/mmc/host/mmci.h | 15 ++++ > drivers/mmc/host/mmci_stm32_sdmmc.c | 6 +- > 3 files changed, 132 insertions(+), 32 deletions(-) > --- > base-commit: 3dff3b32d4752f4a0655fad3c8669978c291ae59 > change-id: 20230405-pl180-busydetect-fix-66a0360d398a > > Best regards, > -- > Linus Walleij <linus.walleij@xxxxxxxxxx> Applied patch1->patch9 for next, thanks! Let's continue to chat a bit more about patch10, to conclude. Kind regards Uffe