Hi Shawn,

On 3/2/19 1:47 AM, Shawn Lin wrote:
> On 2019/3/2 0:43, Christoph Muellner wrote:
>> When using direct commands (DCMDs) on an RK3399, we get spurious
>> CQE completion interrupts for the DCMD transaction slot (#31):
>
> I didn't see it. Do you try any newer code, for instance, linux-next?

I can reproduce this with all kernel versions from 4.16 up to linus/master.
So all kernels with the cqhci driver (which was merged for 4.15) are affected.

All I need to do to reproduce the issue is to boot the system with
a root file system on the eMMC. I use a Debian-stable-based rootfs.

>
>>
>> [ 931.196520] ------------[ cut here ]------------
>> [ 931.201702] mmc1: cqhci: spurious TCN for tag 31
>> [ 931.206906] WARNING: CPU: 0 PID: 1433 at
>> /usr/src/kernel/drivers/mmc/host/cqhci.c:725 cqhci_irq+0x2e4/0x490
>> [ 931.206909] Modules linked in:
>> [ 931.206918] CPU: 0 PID: 1433 Comm: irq/29-mmc1 Not tainted
>> 4.19.8-rt6-funkadelic #1
>> [ 931.206920] Hardware name: Theobroma Systems RK3399-Q7 SoM (DT)
>> [ 931.206924] pstate: 40000005 (nZcv daif -PAN -UAO)
>> [ 931.206927] pc : cqhci_irq+0x2e4/0x490
>> [ 931.206931] lr : cqhci_irq+0x2e4/0x490
>> [ 931.206933] sp : ffff00000e54bc80
>> [ 931.206934] x29: ffff00000e54bc80 x28: 0000000000000000
>> [ 931.206939] x27: 0000000000000001 x26: ffff000008f217e8
>> [ 931.206944] x25: ffff8000f02ef030 x24: ffff0000091417b0
>> [ 931.206948] x23: ffff0000090aa000 x22: ffff8000f008b000
>> [ 931.206953] x21: 0000000000000002 x20: 000000000000001f
>> [ 931.206957] x19: ffff8000f02ef018 x18: ffffffffffffffff
>> [ 931.206961] x17: 0000000000000000 x16: 0000000000000000
>> [ 931.206966] x15: ffff0000090aa6c8 x14: 0720072007200720
>> [ 931.206970] x13: 0720072007200720 x12: 0720072007200720
>> [ 931.206975] x11: 0720072007200720 x10: 0720072007200720
>> [ 931.206980] x9 : 0720072007200720 x8 : 0720072007200720
>> [ 931.206984] x7 : 0720073107330720 x6 : 00000000000005a0
>> [ 931.206988] x5 : ffff00000860d4b0 x4 : 0000000000000000
>> [ 931.206993] x3 : 0000000000000001 x2 : 0000000000000001
>> [ 931.206997] x1 : 1bde3a91b0d4d900 x0 : 0000000000000000
>> [ 931.207001] Call trace:
>> [ 931.207005] cqhci_irq+0x2e4/0x490
>> [ 931.207009] sdhci_arasan_cqhci_irq+0x5c/0x90
>> [ 931.207013] sdhci_irq+0x98/0x930
>> [ 931.207019] irq_forced_thread_fn+0x2c/0xa0
>> [ 931.207023] irq_thread+0x114/0x1c0
>> [ 931.207027] kthread+0x128/0x130
>> [ 931.207032] ret_from_fork+0x10/0x20
>> [ 931.207035] ---[ end trace 0000000000000002 ]---
>>
>> The driver shows this message only for the first spurious interrupt
>> by using WARN_ONCE(). Changing this to WARN() shows, that this is
>> happening quite frequently (up to once a second).
>>
>> Since the eMMC 5.1 specification, where CQE and CQHCI are specified,
>> does not mention that spurious TCN interrupts for DCMDs can be simply
>> ignored, we must assume that using this feature is not working reliably.
>>
>> The current implementation uses DCMD for REQ_OP_FLUSH only, and
>> I could not see any performance/power impact when disabling
>> this optional feature for RK3399.
>>
>> Therefore this patch disables DCMDs for RK3399.
>
> We need to sort out the problem, and see if it could be solved, or
> we just simply remove MMC_CAP2_CQE_DCMD it from sdhci-of-arasan

I fully agree that we should address it in the driver if the driver were buggy.

Therefore I debugged the issue and used an event log based on atomic_t
variables to observe what is going on. It is indeed the case that we get
a second, spurious interrupt (an interrupt for a slot whose doorbell bit
was not set previously) from the controller every now and then.
Only slot #31 is affected (so only DCMDs), and only if DCMD support is enabled.

I disagree that we should disable it for sdhci-of-arasan (i.e. for all
Arasan eMMC 5.1 based controllers), because I cannot say that all
Arasan eMMC 5.1 based implementations are affected.
I only know that the one in the RK3399 is affected (mainly because I don't
have access to more devices with this IP core).

Therefore the series disables it for RK3399.

Thanks,
Christoph

>
>>
>> Signed-off-by: Christoph Muellner
>> <christoph.muellner@xxxxxxxxxxxxxxxxxxxxx>
>> Signed-off-by: Philipp Tomsich <philipp.tomsich@xxxxxxxxxxxxxxxxxxxxx>
>> ---
>>  arch/arm64/boot/dts/rockchip/rk3399.dtsi | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> index 6cc1c9fa4ea6..1bbf0da4e01d 100644
>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> @@ -333,6 +333,7 @@
>>          phys = <&emmc_phy>;
>>          phy-names = "phy_arasan";
>>          power-domains = <&power RK3399_PD_EMMC>;
>> +        disable-cqe-dcmd;
>>          status = "disabled";
>>      };
>>
>
>
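
For reference, the kind of check described in the reply (an event log built from atomic counters, flagging a completion for a slot whose doorbell bit was not set) could look roughly like the userspace sketch below. This is not the cqhci driver code and not the actual debug patch; `struct cq_model`, `cq_init()`, `cq_issue()` and `cq_complete()` are hypothetical names, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 32u   /* CQHCI provides 32 task slots */
#define DCMD_SLOT 31u   /* slot used for direct commands */

/* Hypothetical model of the state the host would track per controller. */
struct cq_model {
    _Atomic uint32_t doorbell;           /* bit n set: slot n has a task in flight */
    atomic_uint completed[NUM_SLOTS];    /* event log: expected completions */
    atomic_uint spurious[NUM_SLOTS];     /* event log: completions without doorbell */
};

void cq_init(struct cq_model *m)
{
    atomic_init(&m->doorbell, 0);
    for (unsigned i = 0; i < NUM_SLOTS; i++) {
        atomic_init(&m->completed[i], 0);
        atomic_init(&m->spurious[i], 0);
    }
}

/* Record that a task was queued in the given slot. */
void cq_issue(struct cq_model *m, unsigned tag)
{
    assert(tag < NUM_SLOTS);
    atomic_fetch_or(&m->doorbell, UINT32_C(1) << tag);
}

/*
 * Handle a completion notification for the given slot.
 * Returns true for an expected completion, false for a spurious one
 * (the slot's doorbell bit was not set when the notification arrived).
 */
bool cq_complete(struct cq_model *m, unsigned tag)
{
    assert(tag < NUM_SLOTS);
    uint32_t bit = UINT32_C(1) << tag;
    /* Clear the bit atomically and look at its previous value. */
    uint32_t old = atomic_fetch_and(&m->doorbell, ~bit);

    if (old & bit) {
        atomic_fetch_add(&m->completed[tag], 1);
        return true;
    }
    atomic_fetch_add(&m->spurious[tag], 1);
    return false;
}
```

In this model, issuing one task on `DCMD_SLOT` and then reporting two completions for it counts the first as expected and the second as spurious, which is the pattern described above for slot #31.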