Hi, On Tue, Jul 9, 2019 at 5:02 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > On Mon, 8 Jul 2019 at 23:12, Doug Anderson <dianders@xxxxxxxxxxxx> wrote: > > > > Hi, > > > > On Mon, Jul 8, 2019 at 3:54 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > > > On Thu, 4 Jul 2019 at 02:02, Doug Anderson <dianders@xxxxxxxxxxxx> wrote: > > > > > > > > Hi, > > > > > > > > On Tue, Jun 18, 2019 at 8:35 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > > > > > > > > To use the so called powered-on re-initialization of an SDIO card, the > > > > > power to the card must obviously have stayed on. If not, the initialization > > > > > will simply fail. > > > > > > > > > > In the runtime suspend case, the card is always powered off. Hence, let's > > > > > drop the support for powered-on re-initialization during runtime resume, as > > > > > it doesn't make sense. > > > > > > > > > > Moreover, during a HW reset, the point is to cut the power to the card and > > > > > then do fresh re-initialization. Therefore drop the support for powered-on > > > > > re-initialization during HW reset. > > > > > > > > > > Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx> > > > > > --- > > > > > drivers/mmc/core/sdio.c | 8 +------- > > > > > 1 file changed, 1 insertion(+), 7 deletions(-) > > > > > > > > This has been on my list of things to test for a while but I never > > > > quite got to it... > > > > > > > > ...and then, today, I spent time bisecting why the "reset" > > > > functionality of miwfiex is broken on my 4.19 kernel [1]. AKA, this > > > > is broken: > > > > > > > > cd /sys/kernel/debug/mwifiex/mlan0 > > > > echo 1 > reset > > > > > > > > I finally bisected the problem and tracked it down to commit > > > > ca8971ca5753 ("mmc: dw_mmc: Prevent runtime PM suspend when SDIO IRQs > > > > are enabled"), which embarrassingly has my Tested-by on it. I guess I > > > > never tested the Marvell reset call. :-/ > > > > > > > > I dug a little and found that when the Marvell code did its reset we > > > > ended up getting a call to dw_mci_enable_sdio_irq(enb=0) and never saw > > > > a dw_mci_enable_sdio_irq(enb=1) after. I tracked it down further and > > > > found that specifically it was the call to mmc_signal_sdio_irq() in > > > > mmc_sdio_power_restore() that was making the call. The call stack > > > > shown for the "enb=0" call: > > > > > > > > [<c071a290>] (dw_mci_enable_sdio_irq) from [<c070a960>] > > > > (mmc_sdio_power_restore+0x98/0xc0) > > > > [<c070a960>] (mmc_sdio_power_restore) from [<c070a9b4>] > > > > (mmc_sdio_reset+0x2c/0x30) > > > > [<c070a9b4>] (mmc_signal_sdio_irq) from [<c06ff160>] (mmc_hw_reset+0xbc/0x138) > > > > [<c06ff160>] (mmc_hw_reset) from [<bf1bbad8>] > > > > (mwifiex_sdio_work+0x5d4/0x678 [mwifiex_sdio]) > > > > [<bf1bbad8>] (mwifiex_sdio_work [mwifiex_sdio]) from [<c0247cd0>] > > > > (process_one_work+0x290/0x4b4) > > > > > > > > I picked your patch here (which gets rid of the call to > > > > mmc_signal_sdio_irq()) and magically the problem went away because > > > > there is no more call to mmc_signal_sdio_irq(). > > > > > > > > I personally don't have lots of history about the whole > > > > "powered_resume" code path. I checked and mmc_card_keep_power() was 0 > > > > in my test case of getting called from hw_reset, so the rest of this > > > > patch doesn't affect me at all. This surprised me a little since I > > > > saw "MMC_PM_KEEP_POWER" being set in mwifiex but then I realized that > > > > it was only set for the duration of suspend and then cleared by the > > > > core. ;-) > > > > > > > > I will also say that I don't have any test case or knowledge of how > > > > SDIO runtime suspend/resume is supposed to work since on dw_mmc SDIO > > > > cards are currently not allowed to runtime suspend anyway. ;-) > > > > > > > > > > > > So I guess the result of all that long-winded reply is that for on > > > > rk3288-veyron-jerry: > > > > > > > > Fixes: ca8971ca5753 ("mmc: dw_mmc: Prevent runtime PM suspend when > > > > SDIO IRQs are enabled") > > > > Tested-by: Douglas Anderson <dianders@xxxxxxxxxxxx> > > > > > > Thanks a lot for testing and for your detailed feedback. I have > > > amended the patch by adding your tags above. > > > > Sure! I'm going to try to do some detailed testing on the next patch > > too to confirm it's OK, but I have a few other tasks to get to first. > > ;-) > > > > > > > Moreover, we seems to need this for stable as well, but I am leaving > > > that to be managed as a separate task. We may even consider the hole > > > series for stable, but that requires more testing first. > > > > Sure, makes sense. We'll pick it to the Chrome OS 4.19 kernel > > directly but it's usually nice to get fixes like this into stable so > > everyone can benefit. > > > > > > > > One last note is that, though Marvell WiFi works after a reset after > > > > this commit, Marvell Bluetooth (on the same SDIO module) doesn't. I > > > > guess next week it'll be another bisect... > > > > > > Is the Bluetooth connected to the same SDIO interface, thus the > > > Bluetooth driver is an SDIO func driver? > > > > Yes, it's a SDIO func driver connected to the same interface. So far > > I've managed to confirm the problem on: > > > > v4.4 (plus 76ae3e26ea43 mwifiex: add debugfs file for testing reset of card) > > v4.9 > > next-20190708 > > > > ...so it seems like it's not a "regression", it's just never worked. > > :-P I guess I'll have to see if I can figure out what's different in > > our chromeos-3.14 kernel. Ah, I see. In 3.14 we had this solution: > > > > pr_err("Resetting card...\n"); > > mmc_remove_host(reset_host); > > /* 200ms delay is based on experiment with sdhci controller */ > > mdelay(200); > > reset_host->rescan_entered = 0; > > mmc_add_host(reset_host); > > > > ...I think that didn't fly upstream. ...but I can confirm that this works fine: > > > > cd /sys/bus/platform/drivers/dwmmc_rockchip > > echo ff0d0000.dwmmc > unbind > > sleep .5 > > echo ff0d0000.dwmmc > bind > > > > ...so I guess this boils down to: how does the mwifiex reset code not > > behave like a full removal and re-insertion of the card? Oh, but > > maybe that's obvious. We're doing all the reset / re-init from the > > WiFi side of things (mwifiex_sdio_card_reset_work) but I don't think > > anything is communicated to the Bluetooth side of things. Presumably > > this is just totally broken for everyone? ...or am I confused? > > Nope, that is most likely what is happening. > > I am not sure what is the best method to deal with this. Perhaps we > should invent some callback the SDIO core code can call, for each > active SDIO func on the particular SDIO card that becomes reset. > > Or is there a better way you think? I didn't get a chance to fully dig today, but I keep thinking that the cleanest way would be if I could somehow tell the MMC core to pretend to unplug the card and then re-plug it in. Then we could go through all the standard code paths. I remember the last time I looked for a nice way to do that I couldn't find one. ...barring that I wonder if it's enough to just remove and re-add all the active SDIO funcs? -Doug