Hi, On Tue, May 28, 2019 at 12:22 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > On Mon, 29 Apr 2019 at 22:41, Douglas Anderson <dianders@xxxxxxxxxxxx> wrote: > > > > Processing SDIO interrupts while dw_mmc is suspended (or partly > > suspended) seems like a bad idea. We really don't want to be > > processing them until we've gotten ourselves fully powered up. > > I fully agree. > > Although, this is important not only from the host driver/controller > perspective, but also from the SDIO card (managed by mmc core) point > of view. > > In $subject patch you mange the driver/controller issue, but only for > one specific host driver (dw_mmc). I am thinking that this problem may > be a rather common problem, so perhaps we should try to address this > from the core in a way that it affects all host drivers. Did you > consider that? I did wonder that. See below, but in general I don't have massive experience with all the host controllers out there. Looking at sdhci code, though, it might not have the same problems? At least in some cases it fully turns off the interrupts. In other cases it seems like the controller itself keeps power and so maybe getting the SDIO interrupts early is OK? > The other problem I refer to, is in principle a way to prevent > sdio_run_irqs() from being executed before the SDIO card has been > resumed, via mmc_sdio_resume(). It's a separate problem, but certainly > related. This may need some more thinking to address properly, let's > just keep this in mind and discuss this in a separate thread. Actually, I think if we could figure out how to do this well it might solve my particular problem. Specifically I don't believe that running dw_mmc's interrupt handler itself is causing the problem (though it does seem pretty odd to run it while we're in the middle of initting the host), I think it's the SDIO driver calling back into us that's causing the problems. > > You might be wondering how it's even possible to become suspended when > > an SDIO interrupt is active. As can be seen in > > dw_mci_enable_sdio_irq(), we explicitly keep dw_mmc out of runtime > > suspend when the SDIO interrupt is enabled. ...but even though we > > stop normal runtime suspend transitions when SDIO interrupts are > > enabled, the dw_mci_runtime_suspend() can still get called for a full > > system suspend. > > > > Let's handle all this by explicitly masking SDIO interrupts in the > > suspend call and unmasking them later in the resume call. To do this > > cleanly I'll keep track of whether the client requested that SDIO > > interrupts be enabled so that we can reliably restore them regardless > > of whether we're masking them for one reason or another. > > > > It should be noted that if dw_mci_enable_sdio_irq() is never called > > (for instance, if we don't have an SDIO card plugged in) that > > "client_sdio_enb" will always be false. In those cases this patch > > adds a tiny bit of overhead to suspend/resume (a spinlock and a > > read/write of INTMASK) but other than that is a no-op. The > > SDMMC_INT_SDIO bit should always be clear and clearing it again won't > > hurt. > > Thanks for the detailed problem description. In general your approach > sounds okay to me, but I have a few questions. > > 1) As kind of stated above, did you consider a solution where the core > simply disables the SDIO IRQ in case it isn't enabled for system > wakeup? In this way all host drivers would benefit. I can give it a shot if you can give me a bunch of specific advice, but I only have access to a few devices doing anything with SDIO and they are all using Marvell or Broadcom on dw_mmc. In general I have no idea how SDIO wakeup (plumbed through the SD controller) would work. As per below the only way I've seen it done is totally out-of-band. ...and actually, I'm not sure I've actually ever seen even the out of band stuff truly work on a system myself. It's always been one of those "we should support wake on WiFi" but never made it all the way to implementation. In any case, if there are examples of people plumbing wakeup through the SD controller I'd need to figure out how to not break them. Just doing a solution for dw_mmc means I don't have to worry about this since dw_mmc definitely doesn't support SDIO wakeup. Maybe one way to get a more generic solution is if you had an idea for a patch that would work for many host controllers then you could post it and I could test to confirm that it's happy on dw_mmc? ...similar to when you switched dw_mmc away from the old kthread-based SDIO stuff? > 2) dw_mmc isn't calling device_init_wakeup() during ->probe(), hence I > assume it doesn't support the SDIO IRQ being configured as system > wakeup. Correct? I understand this is platform specific, but still it > would be good to know your view. Right, currently dw_mmc doesn't support the SDIO IRQ being configured as a system wakeup. I'm kinda curious how this works in general. Don't you need a clock running to get an SDIO interrupt? How does that work for suspend? ...I do know that I've seen some dw_mmc host controllers have an "SDIO IRQ" line that could be pinned out and that line would also assert the same SDIO interrupt, but the mainline driver has never supported it. Whenever I asked Marvell about it in the past they were always confused about what to do with that line and (if I remember correctly) we never hooked it up. I always though it would be super interesting because it seemed like it would let us disable the card clock when the slot was idle. ...as far as I was ever able to discern the pin was officially "non-standard". As far as I know SDIO cards that want to be able to wakeup the device end up using some sort of out of band mechanism. For instance "marvell,wakeup-pin" or Broadcom's "host-wake" interrupt. As per above, I don't have tons of experience here. I see that "rk3399-gru-kevin" has "marvell,wakeup-pin" defined, but that's PCIe not SDIO. Downstream in our Chrome OS kernel it seems like "mt8173-oak.dtsi" and "mt8176-rowan.dtsi" have this for SDIO but they are are devices I didn't work on and don't have much familiarity with. > 3) Because of 2) The below code in dw_mci_runtime_suspend(), puzzles me: > "if (host->slot->mmc->pm_flags & MMC_PM_KEEP_POWER)" > dw_mci_set_ios(host->slot->mmc, &host->slot->mmc->ios); > > Why is 3) needed at all in case system wakeup isn't supported? That code has been in there for a really long time, dating back to commit ab269128a2cf ("mmc: dw_mmc: Add sdio power bindings"). ...but, in general, I know that we always keep power to the SDIO card in suspend time. This doesn't take a whole lot of power and speeds up WiFi acquisition after resume by a whole lot (because otherwise we'd have to fully re-load the WiFi firmware). In fact, I believe that the Marvell WiFi driver requires it. Ah yes, search for "MMC_PM_KEEP_POWER" in "marvell/mwifiex/sdio.c" and you'll see that it gets yells if power isn't kept. If we are keeping power during suspend/resume then presumably the card will expect that communication resumes as normal upon resume. AKA: the clock should come up at the expected rate / voltage level and we should resume as normal. > A note; the current support in the mmc core for the SDIO IRQ being > used as system wakeup, really needs some re-work. For example, we > should convert to use common wakeup interfaces, as to allow the PM > core to behave correctly during system suspend/resume. These are > changes that have been scheduled on my TODO list since long time ago, > I hope I can get some time to look into them soon. Personally my knowledge of SD / SDIO is mostly acquired by trying to make the dw_mmc driver do what we've needed it to over the years, so I don't actually have a ton of broad understanding of the SDIO spec and what is generally done for host controllers / SDIO cards. If you're aware of standard ways to get an SDIO IRQ to work in suspend time then that'd be interesting. > > Without this fix it can be seen that rk3288-veyron Chromebooks with > > Marvell WiFi would sometimes fail to resume WiFi even after picking my > > recent mwifiex patch [1]. Specifically you'd see messages like this: > > mwifiex_sdio mmc1:0001:1: Firmware wakeup failed > > mwifiex_sdio mmc1:0001:1: PREP_CMD: FW in reset state > > > > ...and tracing through the resume code in the failing cases showed > > that we were processing a SDIO interrupt really early in the resume > > call. > > > > NOTE: downstream in Chrome OS 3.14 and 3.18 kernels (both of which > > support the Marvell SDIO WiFi card) we had a patch ("CHROMIUM: sdio: > > Defer SDIO interrupt handling until after resume") [2]. Presumably > > this is the same problem that was solved by that patch. > > Seems reasonable. Anyway, let me know what you think the next set of steps ought to be. It would be sorta nice to get suspend/resume working reliably and land this patch, but if you think that will impede our ability to come up with a more generic solution then I guess we don't need to land it... -Doug