On Mon 28 Oct 12:11 PDT 2019, Mark Brown wrote:

> On Mon, Oct 28, 2019 at 11:40:19AM -0700, Bjorn Andersson wrote:
> > On Mon 28 Oct 10:48 PDT 2019, Mark Brown wrote:
> > > On Mon, Oct 28, 2019 at 08:03:08AM -0700, kernelci.org bot wrote:
>
> > > Today's -next (and Friday's) fails to boot on db820c:
>
> > > > defconfig:
> > > >     gcc-8:
> > > >         apq8096-db820c: 1 failed lab
>
> > > It looks like it deadlocks somewhere, the last things in the log are a
> > > failure to start ufshcd-qcom and then an RCU stall some time later:
>
> > db820c has been failing intermittently for a while now, it seems that
> > booting with kpti enabled causes something to go wrong. There is
> > nothing strange in the kernel logs and ftrace seems to indicate that all
> > the CPUs are idling nicely.
>
> Oh dear.  Adding Catalin and Will.  Is it definitely KPTI that's
> triggering stuff?  It did turn up some bugs on other systems, though
> it's a bit strange it's only manifesting in KernelCI...

I did a test recently where I booted my db820c 100 times with kpti=yes
and 100 times with kpti=no on the kernel command line. With kpti=yes,
90% of the boots failed to reach the console; with kpti=no, 0% failed.

Going back and looking at the logs for the 10% indicated that the boot
CPU was fine, but I had stalls reported on other CPUs.

In an effort to rule out driver bugs I reduced the DT to the CPUs, the
core clocks, GIC, timers and serial driver, and I still saw the problem.

I have not looked at this with JTAG and hence do not know what the
secure world is doing.

Regards,
Bjorn