Hi Finn,
Am 15.03.2024 um 20:24 schrieb Finn Thain:
On Fri, 15 Mar 2024, Michael Schmitz wrote:
No luck with whatever I tried around signals, cache maintenance and mm.
The 'BUG: Bad rss-counter state' message suggests we're freeing the same
page ranges twice, sometimes in many cases. I cannot quite see how
preempting the kernel on interupt return would cause this. Signal
forcing process exit but process exiting before signal is received due
to preemption? But skipping preemption when a signal is pending did not
change anything in my tests...
Running out of ideas here, sorry.
FWIW, I found that the failure mode (with CONFIG_PREEMPT) changed
significantly after I disabled hard irqs in do_IRQ() using the patch I
sent on the 8th. In three stress-ng test runs, I got a soft lockup, a WARN
from set_fc() and some CONFIG_DEBUG_LIST failures...
Yes, I do see that with your patch, too. I still see the old 'table
already free' bug, though.
As far as I can see, the set_fc warning is from access_error040 and is
part of the access error exception that is taken in interrupt context.
The question is basically - why is __free_one_page() called from
interrupt context? Did that also happen before Geert's preemption patch?
preempt_schedule_irq() is currently called in both the interrupt and
exception return code path. Maybe it should only be called on exception
(i.e. syscall) return?
Cheers,
Michael
# /root/stress-ng -t 60 --zombie -1
stress-ng: info: [46] setting to a 60 second run per stressor
stress-ng: info: [46] dispatching hogs: 1 zombie
[ 29.670000] ------------[ cut here ]------------
[ 29.670000] WARNING: CPU: 0 PID: 0 at arch/m68k/include/asm/processor.h:92 buserr_c+0x486/0x5ba
[ 29.670000] Modules linked in:
[ 29.670000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc7-mac-00234-g138ba0024ff0 #9
[ 29.670000] Stack from 00583c20:
[ 29.670000] 00583c20 0054af2c 0054af2c 00000000 006e0100 00583c40 004980b6 0054af2c
[ 29.670000] 00583c54 00492ff6 00000505 0059ff94 0b46005f 00583c84 0001c73e 0053a3bc
[ 29.670000] 0000005c 000061b6 00000009 00000000 00000000 00000505 00583cc8 00696ef0
[ 29.670000] 008dc490 00583cbc 000061b6 0053a3bc 0000005c 00000009 00000000 00000000
[ 29.670000] 006eee78 00002ff5 00000008 00000001 005b0b4c 006eee74 02d61260 00583d34
[ 29.670000] 000026b4 00583cc8 006eee78 00000000 006eee78 00002ff5 00000008 005b0b4c
[ 29.670000] Call Trace: [<004980b6>] dump_stack+0x10/0x16
[ 29.670000] [<00492ff6>] __warn+0x90/0xb6
[ 29.670000] [<0001c73e>] warn_slowpath_fmt+0x10e/0x1a4
[ 29.670000] [<000061b6>] buserr_c+0x486/0x5ba
[ 29.670000] [<000061b6>] buserr_c+0x486/0x5ba
[ 29.670000] [<00002ff5>] __get_wchan+0x4d/0x58
[ 29.670000] [<000026b4>] buserr+0x20/0x28
[ 29.670000] [<00002ff5>] __get_wchan+0x4d/0x58
[ 29.670000] [<000ea558>] __free_one_page+0x2ec/0x532
[ 29.670000] [<000ea8bc>] free_pcppages_bulk+0x11e/0x18a
[ 29.670000] [<00002ff5>] __get_wchan+0x4d/0x58
[ 29.670000] [<00002700>] ret_from_kernel_thread+0xc/0x14
[ 29.670000] [<000eaa86>] free_unref_page_commit+0x15e/0x314
[ 29.670000] [<00002ce0>] show_regs+0x48/0xb8
[ 29.670000] [<00002004>] _start+0x4/0x8
[ 29.670000] [<000ed452>] free_unref_page+0x12a/0x190
[ 29.670000] [<00065be4>] rcu_cblist_dequeue+0x0/0x28
[ 29.670000] [<000ed648>] __free_pages+0xb4/0x10c
[ 29.670000] [<00018340>] thread_stack_free_rcu+0x3a/0x44
[ 29.670000] [<0006436c>] rcu_core+0x2c2/0x79e
[ 29.670000] [<00054fda>] handle_irq_event+0x7a/0x90
[ 29.670000] [<00064928>] rcu_core_si+0x8/0xc
[ 29.670000] [<0049952c>] __do_softirq+0xa4/0x1f8
[ 29.670000] [<00492600>] memmove+0x8c/0xe0
[ 29.670000] [<00038196>] parse_args+0x0/0x3a6
[ 29.670000] [<000485f8>] arch_cpu_idle_exit+0x0/0x8
[ 29.670000] [<000485f0>] arch_cpu_idle_enter+0x0/0x8
[ 29.670000] [<0001fee6>] irq_exit+0x44/0x4e
[ 29.670000] [<00002986>] do_IRQ+0x32/0x6c
[ 29.670000] [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[ 29.670000] [<00498d66>] default_idle_call+0x0/0x46
[ 29.670000] [<00002874>] auto_irqhandler_fixup+0x4/0xc
[ 29.670000] [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[ 29.670000] [<00498d66>] default_idle_call+0x0/0x46
[ 29.670000] [<00498d88>] default_idle_call+0x22/0x46
[ 29.670000] [<00048680>] do_idle+0x6a/0xf0
[ 29.670000] [<00048616>] do_idle+0x0/0xf0
[ 29.670000] [<00036742>] find_task_by_pid_ns+0x0/0x2a
[ 29.670000] [<0005cfd4>] __rcu_read_lock+0x0/0x12
[ 29.670000] [<000489cc>] cpu_startup_entry+0x1a/0x1c
[ 29.670000] [<00063976>] __rcu_read_unlock+0x0/0x26
[ 29.670000] [<00498e4a>] kernel_init+0x0/0xfa
[ 29.670000] [<004897c0>] strcpy+0x0/0x1e
[ 29.670000] [<0049379c>] _printk+0x0/0x16
[ 29.670000] [<00489942>] strlen+0x0/0x22
[ 29.670000] [<0062d2c6>] memblock_alloc_try_nid+0x0/0x82
[ 29.670000] [<0062139a>] arch_post_acpi_subsys_init+0x0/0x8
[ 29.670000] [<00621918>] console_on_rootfs+0x0/0x60
[ 29.670000] [<00620410>] _sinittext+0x410/0xadc
[ 29.670000]
[ 29.670000] ---[ end trace 0000000000000000 ]---
[ 29.670000] Unable to handle kernel access at virtual address 0b46005f
[ 29.670000] Oops: 00000000
[ 29.670000] Modules linked in:
[ 29.670000] PC: [<002951de>] __list_add_valid_or_report+0x1a/0x102
[ 29.670000] SR: 2700 SP: 00583cc8 a2: 00587590
[ 29.670000] d0: 000007e0 d1: 006eee78 d2: 00000000 d3: 006eee78
[ 29.670000] d4: 00002ff5 d5: 00000008 a0: 005b0b4c a1: 0b46005b
[ 29.670000] Process swapper (pid: 0, task=00587590)
[ 29.670000] Frame format=7 eff addr=0b46005f ssw=0505 faddr=0b46005f
[ 29.670000] wb 1 stat/addr/data: 0000 00000000 00000000
[ 29.670000] wb 2 stat/addr/data: 0000 00000000 00000000
[ 29.670000] wb 3 stat/addr/data: 0000 0b46005f 00000000
[ 29.670000] push data: 00000000 00000000 00000000 00000000
[ 29.670000] Stack from 00583d30:
[ 29.670000] 00000000 00583d6c 000ea558 006eee78 005b0b4c 0b46005b 007a5380 00000000
[ 29.670000] 0000000f 0000000c 00000005 00000001 006eee78 007a5358 005b2090 00583dc8
[ 29.670000] 000ea8bc 006eee74 00002ff5 005b0314 00000000 000000fc 00000000 00000f07
[ 29.670000] 00000003 000001ca 000000e0 00000001 006e307c 007a5358 005b0314 00000011
[ 29.670000] 006eee74 00000039 00002700 005b04fc 00000000 000000aa 00583e08 000eaa86
[ 29.670000] 005b0314 0000004e 007a5358 00000003 00000001 00002ce0 00000000 00000000
[ 29.670000] Call Trace: [<000ea558>] __free_one_page+0x2ec/0x532
[ 29.670000] [<000ea8bc>] free_pcppages_bulk+0x11e/0x18a
[ 29.670000] [<00002ff5>] __get_wchan+0x4d/0x58
[ 29.670000] [<00002700>] ret_from_kernel_thread+0xc/0x14
[ 29.670000] [<000eaa86>] free_unref_page_commit+0x15e/0x314
[ 29.670000] [<00002ce0>] show_regs+0x48/0xb8
[ 29.670000] [<00002004>] _start+0x4/0x8
[ 29.670000] [<000ed452>] free_unref_page+0x12a/0x190
[ 29.670000] [<00065be4>] rcu_cblist_dequeue+0x0/0x28
[ 29.670000] [<000ed648>] __free_pages+0xb4/0x10c
[ 29.670000] [<00018340>] thread_stack_free_rcu+0x3a/0x44
[ 29.670000] [<0006436c>] rcu_core+0x2c2/0x79e
[ 29.670000] [<00054fda>] handle_irq_event+0x7a/0x90
[ 29.670000] [<00064928>] rcu_core_si+0x8/0xc
[ 29.670000] [<0049952c>] __do_softirq+0xa4/0x1f8
[ 29.670000] [<00492600>] memmove+0x8c/0xe0
[ 29.670000] [<00038196>] parse_args+0x0/0x3a6
[ 29.670000] [<000485f8>] arch_cpu_idle_exit+0x0/0x8
[ 29.670000] [<000485f0>] arch_cpu_idle_enter+0x0/0x8
[ 29.670000] [<0001fee6>] irq_exit+0x44/0x4e
[ 29.670000] [<00002986>] do_IRQ+0x32/0x6c
[ 29.670000] [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[ 29.670000] [<00498d66>] default_idle_call+0x0/0x46
[ 29.670000] [<00002874>] auto_irqhandler_fixup+0x4/0xc
[ 29.670000] [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[ 29.670000] [<00498d66>] default_idle_call+0x0/0x46
[ 29.670000] [<00498d88>] default_idle_call+0x22/0x46
[ 29.670000] [<00048680>] do_idle+0x6a/0xf0
[ 29.670000] [<00048616>] do_idle+0x0/0xf0
[ 29.670000] [<00036742>] find_task_by_pid_ns+0x0/0x2a
[ 29.670000] [<0005cfd4>] __rcu_read_lock+0x0/0x12
[ 29.670000] [<000489cc>] cpu_startup_entry+0x1a/0x1c
[ 29.670000] [<00063976>] __rcu_read_unlock+0x0/0x26
[ 29.670000] [<00498e4a>] kernel_init+0x0/0xfa
[ 29.670000] [<004897c0>] strcpy+0x0/0x1e
[ 29.670000] [<0049379c>] _printk+0x0/0x16
[ 29.670000] [<00489942>] strlen+0x0/0x22
[ 29.670000] [<0062d2c6>] memblock_alloc_try_nid+0x0/0x82
[ 29.670000] [<0062139a>] arch_post_acpi_subsys_init+0x0/0x8
[ 29.670000] [<00621918>] console_on_rootfs+0x0/0x60
[ 29.670000] [<00620410>] _sinittext+0x410/0xadc
[ 29.670000]
[ 29.670000] Code: 206e 000c 226e 0010 4a88 6756 4a89 6774 <2029> 0004 b1c0 6600 008e 2410 b3c2 6600 00ae b288 670e 7001 b289 6708 242e fffc
[ 29.670000] Disabling lock debugging due to kernel taint
[ 29.670000] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 29.670000] ---[ end Kernel panic - not syncing: Aiee, killing interrupt handler! ]---