On Fri, Oct 19, 2018 at 04:19:39PM +0300, Adrian Hunter wrote:
> On 19/10/18 12:26 PM, Anisse Astier wrote:
> > Hi Adrian,
> >
> > On Fri, Oct 19, 2018 at 10:07:38AM +0300, Adrian Hunter wrote:
> >> On 18/10/18 1:21 PM, Anisse Astier wrote:
> >>> If we don't have the voltage switch DSM methods available, there's no
> >>> point in advertising to the rest of the kernel that we support 1.8V, or
> >>> support voltage switch at all.
> >>>
> >>> This fixes an issue on a Gemini Lake (GLK) laptop : eMMC driver will
> >>> timeout on boot (from 60seconds to 10minutes ) as the cqhci attempts CQE
> >>> recovery after a failed voltage switch. In earlier kernels, the problem
> >>> existed, but only delayed boot for about 10 seconds after an I/O error,
> >>> allowing booting on the eMMC (almost) unnoticed.
> >>
> >> Can you send the kernel messages? Which kernel is it? Which laptop? An
> >> acpidump might help too.
> >
> > You're right, I should have started with this. I have attached various
> > dmesg traces:
> >  - dmesg-4.18.3-CQE-traces.txt : the original issue that was
> >    encountered, it shows the multiple CQE recovery timeouts, each taking
> >    about 60s
> >  - dmesg-4.19-rc8.noquirk.txt : a boot where the CQE recovery works, and
> >    only an I/O error is shown. I've reduced it to the mmc/sdhci traces.
> >  - dmesg-4.19-rc8.noquirk-with-error.txt : a boot where it fails, this
> >    is similar to the first one, but with a more recent kernel
> >  - dmesg-4.19-rc8.withquirk.txt : a boot with sdhci.debug_quirks2=0x90c
> >    on the command line. I've added the SDHCI_QUIRK2_NO_1_8_V quirk to
> >    the other ones present in the driver. You can see there's no CQE
> >    recovery or mmc I/O errors.
> >
> >
> > I've reproduced the issue with linux 4.17, 4.18 and 4.19-rc8. The laptop
> > is a noname laptop with an Insyde EFI firmware (Manufacturer: Notebook,
> > Product Name: N75_77GU).
> >
> > You'll find the acpidump in the attachment.
> > I've looked into another GLK laptop's tables, and the implemented
> > acpi methods seem to only do a sleep(), which isn't really helpful.
> > I've tried to add a similar msleep in the voltage_switch function,
> > but it didn't seem to help.
>
> Please try disabling CQE i.e.
>
> diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
> index e53333c695b3..c0f8348f75f7 100644
> --- a/drivers/mmc/host/sdhci-pci-core.c
> +++ b/drivers/mmc/host/sdhci-pci-core.c
> @@ -732,7 +732,7 @@ static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
>  {
>  	int ret = byt_emmc_probe_slot(slot);
>
> -	slot->host->mmc->caps2 |= MMC_CAP2_CQE;
> +	//slot->host->mmc->caps2 |= MMC_CAP2_CQE;
>
>  	if (slot->chip->pdev->device != PCI_DEVICE_ID_INTEL_GLK_EMMC) {
>  		slot->host->mmc->caps2 |= MMC_CAP2_HS400_ES,
>

I did that, and while we don't have the long CQE timeouts, we still have
an I/O error:

[    0.468934] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xe0000000-0xe3ffffff] (base 0xe0000000)
[    0.468934] PCI: MMCONFIG at [mem 0xe0000000-0xe3ffffff] reserved in E820
[    0.621318] acpi PNP0A08:00: [Firmware Info]: MMCONFIG for domain 0000 [bus 00-3f] only partially covers this bridge
[    3.727365] sdhci: Secure Digital Host Controller Interface driver
[    3.727365] sdhci: Copyright(c) Pierre Ossman
[    3.729638] sdhci-pci 0000:00:1c.0: SDHCI controller found [8086:31cc] (rev 3)
[    3.731801] mmc0: no DSM function for 1.8 voltage switch
[    3.731802] mmc0: Voltage switching unsupported
[    3.731872] mmc0: CQHCI version 5.10
[    3.735646] mmc0: SDHCI controller on PCI [0000:00:1c.0] using ADMA 64-bit
[    3.816485] mmc0: new HS400 MMC card at address 0001
[    3.819219] mmcblk0: mmc0:0001 M52532 29.1 GiB
[    3.819635] mmcblk0boot0: mmc0:0001 M52532 partition 1 4.00 MiB
[    3.820056] mmcblk0boot1: mmc0:0001 M52532 partition 2 4.00 MiB
[    3.820225] mmcblk0rpmb: mmc0:0001 M52532 partition 3 4.00 MiB, chardev (247:0)
[    3.823106] mmcblk0: p1 p2 p3 p4 p5 p6
[    4.135118] bcache: register_cache() registered cache device mmcblk0p6
[    5.627275] sr 1:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray
[   16.251559] mmc0: Timeout waiting for hardware interrupt.
[   16.251572] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[   16.251581] mmc0: sdhci: Sys addr: 0x00000020 | Version: 0x00001002
[   16.251589] mmc0: sdhci: Blk size: 0x00007200 | Blk cnt: 0x00000020
[   16.251596] mmc0: sdhci: Argument: 0x000f4000 | Trn mode: 0x0000003b
[   16.251603] mmc0: sdhci: Present: 0x1fff0206 | Host ctl: 0x0000003d
[   16.251609] mmc0: sdhci: Power: 0x0000000b | Blk gap: 0x00000080
[   16.251616] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007
[   16.251623] mmc0: sdhci: Timeout: 0x00000006 | Int stat: 0x00000000
[   16.251630] mmc0: sdhci: Int enab: 0x02ff000b | Sig enab: 0x02ff000b
[   16.251637] mmc0: sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[   16.251644] mmc0: sdhci: Caps: 0x546ec881 | Caps_1: 0x80000807
[   16.251651] mmc0: sdhci: Cmd: 0x0000123a | Max curr: 0x00000000
[   16.251658] mmc0: sdhci: Resp[0]: 0x00000800 | Resp[1]: 0x00000000
[   16.251665] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000900
[   16.251671] mmc0: sdhci: Host ctl2: 0x0000000d
[   16.251679] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x0000000173126200
[   16.251682] mmc0: sdhci: ============================================
[   16.252462] mmc0: mmc_hs400_to_hs200 failed, error -84
[   16.253487] mmcblk0: error -84 requesting status
[   16.253856] mmc0: mmc_hs400_to_hs200 failed, error -84
[   16.254111] mmc0: cache flush error -84
[   16.370768] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
[   17.310793] EXT4-fs (mmcblk0p2): re-mounted. Opts: errors=remount-ro
[   17.632001] bcache: register_bcache() error /dev/mmcblk0p6: device already registered
[   18.135208] Adding 524284k swap on /dev/mmcblk0p3. Priority:-2 extents:1 across:524284k SSFS
[   18.818304] EXT4-fs (mmcblk0p4): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[   18.818848] EXT4-fs (mmcblk0p5): mounted filesystem with ordered data mode. Opts: (null)

Anisse
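
For reference, the sdhci.debug_quirks2=0x90c value mentioned in the quoted
mail can be split into individual quirk bits. The sketch below is only an
illustration: the bit positions are an assumption, copied by hand from
drivers/mmc/host/sdhci.h as of the 4.19 era, so please check them against
your own tree. decode_quirks2() and the quirk2_names table are hypothetical
names made up for this example, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * ASSUMPTION: bit positions copied by hand from drivers/mmc/host/sdhci.h
 * (4.19 era). Only the four bits that make up 0x90c are listed.
 */
struct quirk2_name {
	unsigned int bit;
	const char *name;
};

static const struct quirk2_name quirk2_names[] = {
	{ 1u << 2,  "SDHCI_QUIRK2_NO_1_8_V" },
	{ 1u << 3,  "SDHCI_QUIRK2_PRESET_VALUE_BROKEN" },
	{ 1u << 8,  "SDHCI_QUIRK2_STOP_WITH_TC" },
	{ 1u << 11, "SDHCI_QUIRK2_CAPS_BIT63_FOR_HS400" },
};

/*
 * Hypothetical helper (not a kernel function): write the names of the
 * table bits set in quirks2 into buf, separated by " | ".
 * Returns the bits that were not written out, either because the table
 * does not know them or because buf was too small.
 */
static unsigned int decode_quirks2(unsigned int quirks2, char *buf, size_t len)
{
	size_t i, used = 0;
	int n;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(quirk2_names) / sizeof(quirk2_names[0]); i++) {
		if (!(quirks2 & quirk2_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " | " : "", quirk2_names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the truncated name */
			break;
		}
		used += (size_t)n;
		quirks2 &= ~quirk2_names[i].bit;
	}
	return quirks2;
}
```

Under those assumptions, decode_quirks2(0x90c, buf, sizeof(buf)) with a
256-byte buffer should return 0 and list all four names: the three quirks
already present in the driver (0x908) plus SDHCI_QUIRK2_NO_1_8_V (0x4).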