Hi

On 2016-5-18 12:12, Doug Anderson wrote:
> Hi,
>
> On Tue, May 17, 2016 at 6:59 PM, Shawn Lin
> <shawn.lin at kernel-upstream.org> wrote:
>> Could you try this patch to see if you can still find HLE?
>>
>> @@ -2356,12 +2356,22 @@ static void dw_mci_cmd_interrupt(struct dw_mci
>> *host, u32 status)
>>  static void dw_mci_handle_cd(struct dw_mci *host)
>>  {
>>         int i;
>> +       int present;
>>
>>         for (i = 0; i < host->num_slots; i++) {
>>                 struct dw_mci_slot *slot = host->slot[i];
>>
>>                 if (!slot)
>>                         continue;
>>
>> +               present = !(mci_readl(slot->host, CDETECT) & (1 <<
>> slot->id));
>> +               if (present)
>> +                       set_bit(DW_MMC_CARD_PRESENT, &slot->flags);
>> +               else
>> +                       clear_bit(DW_MMC_CARD_PRESENT, &slot->flags);
>
> No, because we don't use the builtin card detect on veyron. ;)
>
> We use GPIO card detect because we didn't like the way JTAG and SD
> interacted. Also on rk3288 the builtin card detect line had the wrong
> voltage domain (you couldn't detect a card when the IO lines were
> powered off). The builtin card detect line is always driven low on
> veyron.

Okay, I see.

>
>
> I'm nearly certain that the root cause of my HLE errors is actually
> related to the same problem addressed by the commit 7c5209c315ea
> ("mmc: core: Increase delay for voltage to stabilize from 3.3V to
> 1.8V"). I think that on minnie we're still on the hairy edge and
> sometimes the line doesn't transition fast enough.

I don't think it is that simple. I had SD3.0 support disabled and still
saw HLE sometimes, so commit 7c5209c315ea does not seem to be related to
this phenomenon.

The scenario looks like:

remove sd-card -> mmc_sd_detect -> send status (CMD13) -> power_off ->
set_ios -> setup_bus -> disable clk, then the HLE irq storm comes

From the code of dw_mci_prepare_command: SDMMC_CMD_PRV_DAT_WAIT is not
used for CMD13, so we don't wait for busy here. Is the cmd then loaded
into the dw_mmc queue but never sent out because it's still busy?
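A simplified, standalone model of that decision may make the CMD13 point
concrete. This is only a sketch and not the driver code: the function
names and the MODEL_CMD_PRV_DAT_WAIT bit are placeholders, and the exact
condition (data-carrying commands other than CMD13 get the wait flag) is
an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* CMD13 (SEND_STATUS) opcode from the MMC/SD command set. */
#define MMC_SEND_STATUS 13u

/* Placeholder bit for this sketch only; not taken from the dw_mmc header. */
#define MODEL_CMD_PRV_DAT_WAIT (1u << 13)

/*
 * Hypothetical model of building the command register value.
 * Assumption: only commands that carry data, and are not CMD13,
 * ask the controller to wait for the previous data transfer.
 */
static uint32_t model_prepare_command(uint32_t opcode, bool has_data)
{
	uint32_t cmdr = opcode;

	if (opcode != MMC_SEND_STATUS && has_data)
		cmdr |= MODEL_CMD_PRV_DAT_WAIT;

	return cmdr;
}

/* True if the modelled command would wait for the previous data transfer. */
static bool model_waits_for_prev_data(uint32_t cmdr)
{
	return (cmdr & MODEL_CMD_PRV_DAT_WAIT) != 0;
}
```

In this model CMD13 never gets the wait flag, so it is handed to the
controller immediately, even if the controller is still busy.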
With my patch, things go well:

remove sd-card -> clear DW_MMC_CARD_PRESENT bit -> send status (CMD13)
returns directly -> power_off -> set_ios -> setup_bus -> disable clk

So why should we allow querying the card status if we are sure the card
has been removed? I mean that no further cmds should be delivered.

And another question: should we wait for busy for CMD13?

>
> It appears that increasing this to 30ms avoids the HLE errors.
>
> I _think_ I can actually fully fix this properly by temporarily
> engaging the internal pull-ups while the voltage switch is happening.
> This will bleed away the voltage just a little bit faster (since lines
> are driven low here). I'll try to confirm that.
>
>
> In any case, it seems like we should take this patch since (without
> this patch) the failure case when you get HLE errors is that the
> interrupt controller fires over and over again (with no printouts) and
> your system stalls with no error messages.

Sure, at least we need to address this irq storm...

>
> -Doug