Hello,

I'm working on mainline support for the Samsung Core Prime VE LTE smartphone based on the Marvell PXA1908 SoC [1], which uses the sdhci-pxav3 mmc host driver for its internal eMMC storage, microSD card slot and SDIO WiFi/BT/FM adapter.

Out of the box, the SD card works; however, the eMMC fails probing with the following error:

mmc0: SDHCI controller on d4281000.mmc [d4281000.mmc] using ADMA
mmc0: Card stuck being busy! __mmc_poll_for_busy
mmc0: error -110 whilst initialising MMC card
mmc0: Tuning failed, falling back to fixed sampling clock
mmc0: Card stuck being busy! __mmc_poll_for_busy
mmc0: error -110 whilst initialising MMC card
mmc0: Card stuck being busy! __mmc_poll_for_busy
mmc0: error -110 whilst initialising MMC card
mmc0: Card stuck being busy! __mmc_poll_for_busy
mmc0: error -110 whilst initialising MMC card
mmc0: Failed to initialize a non-removable card

This error does not occur when the card is forced to DDR_1_8V, but unless caching is also force-disabled (by always returning false from _mmc_cache_enabled), the following errors occur:

[    0.210673] mmc0: SDHCI controller on d4281000.mmc [d4281000.mmc] using ADMA
[    0.301489] mmc0: new DDR MMC card at address 0001
[    0.302156] mmcblk0: mmc0:0001 QNW00A 7.28 GiB
[    0.308976] mmcblk0boot0: mmc0:0001 QNW00A 4.00 MiB
[    0.310318] mmcblk0boot1: mmc0:0001 QNW00A 4.00 MiB
[    0.311571] mmcblk0rpmb: mmc0:0001 QNW00A 512 KiB, chardev (248:0)
[  171.353426] mmc0: Card stuck being busy! __mmc_poll_for_busy
[  171.353443] mmc0: cache flush error -110
[  201.399007] mmc0: Card stuck being busy! __mmc_poll_for_busy
[  201.399023] mmc0: cache flush error -110
[  231.415093] mmc0: Card stuck being busy! __mmc_poll_for_busy
[  231.415109] mmc0: cache flush error -110
[  261.464688] mmc0: Card stuck being busy! __mmc_poll_for_busy
[  261.464703] mmc0: cache flush error -110

In addition, the SDIO is unstable. Sometimes it doesn't even get through firmware loading, reporting -EILSEQ before the transfer is complete. Even if it does initialize successfully, putting a heavier load on the WiFi, for instance, often leads to errors and sometimes to card removal, after which a reboot is necessary for it to work again. Note that apart from this instability, both the WiFi and the BT do work normally.

However, as I added the SDIO support to the mwifiex driver rather naively myself [5] (you can also find the various WiFi errors I am seeing there), I am not confident that it is fully compatible (the vendor kernel has a standalone driver for it, which however appears very similar to mwifiex), and thus I don't know whether these WiFi issues are related to the eMMC issues or are separate. The former seems plausible, which is why I am trying to make the eMMC work first to see if that also fixes the WiFi, rather than spending time on the mwifiex code when the issue might lie elsewhere.

To see what the problem might be in mainline, I have experimented with the v3.14.27-based vendor kernel [2], where the eMMC obviously works out of the box, trying to strip it of all the additional "features" missing from mainline that seemed like they could affect the functionality of the card. In particular, I have

* removed all the quirks specified in the DT,
* removed the sdhci-pxav3.c code that made it treat the compatibles differently (as the compatible used downstream is not available in mainline),
* removed or disabled the tuning code (both for the pxav3 driver, which again contains a lot of things not available in mainline, and the generic sdhci.c tuning),
* removed all the sdhci_ops members not available in mainline,
* removed all the clocks that downstream uses for the mmc in addition to what mainline has, both from the DT and from the corresponding drivers,
* removed several regulators from the DT,

effectively bringing the downstream code to a state comparable to mainline (at least as far as sdhci-pxav3 is concerned). This has however not hindered its ability to probe the eMMC in HS200 mode.

At this point, I thus don't know where to look for the problem other than the mmc core, or possibly still sdhci.c. The SDIO instability would suggest to me that perhaps the issue really is wrong clocking; however, with the downstream removals mentioned above, the two kernels should be on par in this regard, with the additional clocks being preset by the bootloader.

I have also looked through Marvell's fork of the old kernel [3], which contains the commit history. There I found many changes touching drivers/mmc that seemed like they could be the missing piece; however, reverting any of them also didn't affect downstream's ability to probe the eMMC.

Apart from the one mentioned at the beginning of this email, there are several other hacks that enable mainline to probe the eMMC.
The first one is setting card->ext_csd.cache_size = 0 right before the cache is turned on in drivers/mmc/core/mmc.c.

The other thing I noticed is that what __mmc_poll_for_busy checks for on mainline via host->ops->card_busy() (which means sdhci_card_busy):

	!(present_state & SDHCI_DATA_0_LVL_MASK)

differs from what the downstream code checks for:

	R1_CURRENT_STATE(status) == R1_STATE_PRG

Changing this in mainline does indeed make it probe. However, changing the downstream code in turn to use the mainline busy detection mechanism does not break its ability to probe the eMMC, which makes me think that this is not the real problem. (Downstream also uses SDHCI_DATA_LVL_MASK in sdhci_card_busy instead. Using that in mainline does remove the ETIMEDOUT error but makes some operations on the card, such as running mkfs.ext4 on the eMMC, take longer, although the read/write speed isn't affected. Besides, this function never runs downstream.)

With either of the hacks, the eMMC does probe, but dmesg gets spammed with lots of

	Tuning failed, falling back to fixed sampling clock

originating from sdhci.c's __sdhci_execute_tuning(). However, neither of the hacks makes the SDIO work properly, i.e. become stable.

You can also find some of my progress notes here [4].

I will be very grateful for any help, be it a pointer or an idea regarding what the problem might be or what else I could try with either the vendor kernel or mainline, as I am very much at a loss at the moment. If there is any more information that I should provide, please do let me know.

[1] https://lore.kernel.org/r/20241104-pxa1908-lkml-v13-0-e050609b8d6c@xxxxxxxx/
[2] https://github.com/CoderCharmander/g361f-kernel/
[3] https://github.com/acorn-marvell/brillo_pxa_kernel
[4] https://gitlab.com/LegoLivesMatter/linux/-/issues/2
[5] https://lore.kernel.org/r/20231029111807.19261-1-balejk@xxxxxxxxx/

Thank you very much and best regards,
K. B.
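
P.S. To make the busy-detection comparison easier to follow, here is a minimal user-space sketch of the three checks discussed in this email. It is only an illustration and not the actual kernel code: the helper names (busy_dat0, busy_dat_all, busy_r1_prg) are made up for this example, and the constants are copied by hand from what I believe mainline drivers/mmc/host/sdhci.h and include/linux/mmc/mmc.h define, so please double-check them against the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match mainline drivers/mmc/host/sdhci.h (copied by hand). */
#define SDHCI_DATA_LVL_MASK   0x00F00000u /* DAT[3:0] line levels */
#define SDHCI_DATA_0_LVL_MASK 0x00100000u /* DAT[0] line level only */

/* Assumed to match mainline include/linux/mmc/mmc.h (copied by hand). */
#define R1_CURRENT_STATE(x) (((x) & 0x00001E00u) >> 9)
#define R1_STATE_TRAN 4
#define R1_STATE_PRG  7

/*
 * Hypothetical helper: the mainline-style check on the PRESENT_STATE
 * register value. Reports busy while DAT[0] is held low.
 */
static bool busy_dat0(uint32_t present_state)
{
	return !(present_state & SDHCI_DATA_0_LVL_MASK);
}

/*
 * Hypothetical helper: the downstream-style register check. Reports
 * busy only while all of DAT[3:0] are low.
 */
static bool busy_dat_all(uint32_t present_state)
{
	return !(present_state & SDHCI_DATA_LVL_MASK);
}

/*
 * Hypothetical helper: the status-based check. Reports busy while the
 * card status word says the card is in the programming state.
 */
static bool busy_r1_prg(uint32_t status)
{
	return R1_CURRENT_STATE(status) == R1_STATE_PRG;
}
```

For a PRESENT_STATE value of 0x00E00000 (DAT[0] low, DAT[3:1] high), busy_dat0() returns true while busy_dat_all() returns false, so the two register checks disagree whenever only DAT[0] is held low. busy_r1_prg() does not look at the data lines at all; it depends only on the state field of the status word, e.g. it is true for R1_STATE_PRG << 9 and false for R1_STATE_TRAN << 9.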