On Sun, Apr 10, 2022 at 09:33:43PM +1000, Michael Ellerman wrote:
> Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> writes:
> > On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
> >> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:
> ...
> >> > > I haven't seen it in my testing. But using Miguel's config I can
> >> > > reproduce it seemingly on every boot.
> >> > >
> >> > > For me it bisects to:
> >> > >
> >> > > 35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >> > >
> >> > > Which seems plausible.
> >> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
> >> > clockevent processing")
> ...
> >>
> >> > > Reverting that on mainline makes the bug go away.
> > >> > I also revert that on the mainline, and am currently doing a pressure
> >> > test (by repeatedly invoking qemu and checking the console.log) on PPC
> >> > VM in Oregon State University.
> >
> > After 306 rounds of stress test on mainline without triggering the bug
> > (last for 4 hours and 27 minutes), I think the bug is indeed caused by
> > 35de589cb879 ("powerpc/time: improve decrementer clockevent
> > processing") and stop the test for now.
>
> Thanks for testing, that's pretty conclusive.
>
> I'm not inclined to actually revert it yet.
>
> We need to understand if there's actually a bug in the patch, or if it's
> just exposing some existing bug/bad behavior we have. The fact that it
> only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious.
>
> Do we have some code that inadvertently relies on something enabled by
> HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y ?

For whatever it is worth, moderate rcutorture runs to completion without
errors with CONFIG_HIGH_RES_TIMERS=n on 64-bit x86.

Also for whatever it is worth, I don't know of anything other than
microcontrollers or the larger IoT devices that would want their kernels
built with CONFIG_HIGH_RES_TIMERS=n. Which might be a failure of
imagination on my part, but so it goes.

							Thanx, Paul