On Fri 26 Jul 06:48 PDT 2019, Mark Brown wrote:

> On Fri, Jul 26, 2019 at 05:18:01AM -0700, kernelci.org bot wrote:
>
> The past few versions of -next failed to boot on apq8096-db820c:
>
> > defconfig:
> >     gcc-8:
> >         apq8096-db820c: 1 failed lab
>
> with an RCU stall towards the end of boot:
>
> 00:03:40.521336 [ 18.487538] qcom_q6v5_pas adsp-pil: adsp-pil supply px not found, using dummy regulator
> 00:04:01.523104 [ 39.499613] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> 00:04:01.533371 [ 39.499657] rcu: 2-...!: (0 ticks this GP) idle=9ca/1/0x4000000000000000 softirq=1450/1450 fqs=50
> 00:04:01.537544 [ 39.504689] (detected by 0, t=5252 jiffies, g=2425, q=619)
> 00:04:01.541727 [ 39.513539] Task dump for CPU 2:
> 00:04:01.547929 [ 39.519096] seq R running task 0 199 198 0x00000000
>
> Full details and logs at:
>
> https://kernelci.org/boot/id/5d3aa7ea59b5142ba868890f/
>
> The last version that worked was from the 15th and there seem to be
> similar issues in mainline since -rc1.

As you might have seen this problem has come and gone on the
apq8096-db820c and I've finally managed to narrow it down a little bit.

The problem first appears on next-20190701, with the introduction of
CONFIG_RANDOMIZE_BASE in the defconfig, but after further efforts I've
concluded that disabling kpti removes or hides the problem.

With kpti=no on the command line I've now successfully booted the db820c
100+ times without problems (a clear improvement from the 75% failure
rate with kpti=yes).

Unfortunately I'm not yet certain why this is causing issues and I'm
also seeing the same rcu stall on SDA845 under certain (erroneous?)
conditions (where I don't expect them).

Regards,
Bjorn