On Mon, Oct 09, 2017 at 04:47:08PM +0200, Christian Borntraeger wrote: > Paul, > > Michael J. Sullivan from our performance team told me that > commit 9111b9361f2 EXP: Fine-grained timer diagnostics breaks cpu hot unplug for him. > > I can confirm that. Interesting! This commit is not, repeat -not-, ever going upstream. It is strictly diagnostic for a timer problem that I hit from time to time in runs of rcutorture. The fact that it breaks your tests should provide a clue to me, but I am not immediately seeing what that clue might be. For whatever it is worth, here are the symptoms I am seeing: o Sometimes a process's timer is delayed, perhaps even lost. o This only happens when: o CONFIG_HOTPLUG_CPU is enabled, and o The system is booted under qemu with -smp having a much larger value than the maxcpus= kernel boot parameter. For example, "-smp 43" and "maxcpus=8", which causes the first eight CPUs to be brought online immediately at boot and the remaining 35 to be brought online some time later. o Providing redundant periodic wakeups to the affected processes masks the problem. o Affinitying the affected processes away from outgoing CPUs does not fix the problem. (No surprise, as these processes might have previously posted a timeout to an outgoing CPU.) o A rather long list of hypotheses as to the nature of the bug have all proven to be wrong. o The problem takes quite some time to play out, so I have been thus far unable to get an intact trace of the problem. Then again, I have been distracted by other issues for the past few weeks. > A crash dump show (minus the removed CPU) the following: > > [cborntra@s38lp08 linux]$ cat e > PID: 0 TASK: c41800 CPU: 0 COMMAND: "swapper/0" > LOWCORE INFO: > -psw : 0x0404e00180000000 0x0000000000850226 > -function : arch_spin_lock_wait at 850226 > -prefix : 0x7fffc000 > -cpu timer: 0x7fffffece131fc54 > -clock cmp: 0x42e72b784707c600 > #0 [7fffbbf8] lock_timer_base at 1e4d22 > #1 [7fffbc60] mod_timer at 1e5f2c > #2 [7fffbcd0] __sclp_start_request at 6aabae > #3 [7fffbcf8] sclp_process_queue at 6abae0 > #4 [7fffbd38] do_ext_interrupt at 10b90e > #5 [7fffbd80] __handle_irq_event_percpu at 1cb590 > #6 [7fffbdf0] handle_irq_event_percpu at 1cb746 > #7 [7fffbe28] handle_percpu_irq at 1d04a4 > #8 [7fffbe58] generic_handle_irq at 1ca53a > #9 [7fffbe70] do_IRQ at 10bd2e > #10 [7fffbea8] ext_int_handler at 85b2c4 > PSW: 0404c00180000000 0000000000102bfe (enabled_wait+78) > GPRS: d342e71c70aa3658 0000000000000000 000000007ff5a200 0706c00180000000 > 00000000001b3f20 000000000086e4a0 ffffffffffffffff 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000000d790b8 > 000000007ff5a200 0000000000860348 0000000000102bfe 0000000000c2fe48 > #0 [00c2fe70] arch_cpu_idle at 102f52 > #1 [00c2fe90] do_idle at 1b3f3a > #2 [00c2feb8] cpu_startup_entry at 1b4176 > #3 [00c2fee0] start_kernel at d24bd8 > #4 [00c2ff60] _stext at 100020 > > PID: 0 TASK: 7d8eb600 CPU: 1 COMMAND: "swapper/1" > LOWCORE INFO: > -psw : 0x0404c00180000000 0x00000000001163a6 > -function : smp_yield_cpu at 1163a6 > -prefix : 0x7d756000 > -cpu timer: 0x7fffffece146c32a > -clock cmp: 0x42e7215bd6c88e00 > #0 [7d75bcb0] arch_spin_lock_wait at 850298 > #1 [7d75bd08] sclp_interrupt_handler at 6abeda > #2 [7d75bd38] do_ext_interrupt at 10b90e > #3 [7d75bd80] __handle_irq_event_percpu at 1cb590 > #4 [7d75bdf0] handle_irq_event_percpu at 1cb746 > #5 [7d75be28] handle_percpu_irq at 1d04a4 > #6 [7d75be58] generic_handle_irq at 1ca53a > #7 [7d75be70] do_IRQ at 10bd2e > #8 [7d75bea8] ext_int_handler at 85b2c4 > PSW: 0404c00180000000 0000000000102bfe (enabled_wait+78) > GPRS: 0000000003247e88 0000000000000000 000000007ff71200 0706c00180000000 > 00000000001b3f20 000000000086e4a0 0000000000000000 0000000000000000 > 0000000000000000 0000000000000000 0000000000000000 0000000000000001 > 000000007ff71200 0000000000860348 0000000000102bfe 000000007d8ffc78 > #0 [7d8ffca0] arch_cpu_idle at 102f52 > #1 [7d8ffcc0] do_idle at 1b3f3a > #2 [7d8ffce8] cpu_startup_entry at 1b4176 > #3 [7d8ffd10] smp_start_secondary at 115d96 > #4 [7d8ffd50] restart_int_handler at 85b60a > > PID: 84 TASK: 3305a00 CPU: 2 COMMAND: "sh" > LOWCORE INFO: > -psw : 0x0404c00180000000 0x00000000001163a6 > -function : smp_yield_cpu at 1163a6 > -prefix : 0x7d780000 > -cpu timer: 0x7fffffecf69f4974 > -clock cmp: 0x42e71c731cb22c00 > > #0 [033476e0] arch_spin_lock_wait at 850298 > #1 [03347738] lock_timer_base at 1e4d22 > #2 [033477a0] mod_timer at 1e5f2c > #3 [03347810] __sclp_vt220_write at 6ba912 > #4 [033478a0] sclp_vt220_con_write at 6ba9ac > #5 [033478f8] console_unlock at 1c87c8 > #6 [03347978] vprintk_emit at 1c8bbe > #7 [03347a08] vprintk_default at 1c8e1c > #8 [03347a68] printk at 1c9d1e > #9 [03347af8] timers_dead_cpu at 1e66f6 > #10 [03347b68] cpuhp_invoke_callback at 169b50 > #11 [03347c00] _cpu_down at 851522 > #12 [03347c58] do_cpu_down at 16b9fa > #13 [03347c88] device_offline at 5a7826 > #14 [03347cc0] online_store at 5a796e > #15 [03347cf8] kernfs_fop_write at 3ed8d2 > #16 [03347d48] __vfs_write at 34ddb6 > #17 [03347e00] vfs_write at 34e10c > #18 [03347e60] sys_write at 34e44e > #19 [03347ea8] system_call at 85abf4 > > > Reverting the patch fixes the issue, but I do not yet understand why. Welcome to my world! ;-) Hmmm... I have to ask... Have you tried this with lockdep? The spinning on CPUs is suspicious, though not something that I have seen. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-s390" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html