I've been looking a bit closer at the RCU problem on Alpha. In the case of the bug related to interface renaming, after the changes in the networking code, the kernel fails with an invalid pointer reference. From the stack trace one can conclude that this happens when synchronize_rcu_expedited() is used instead of synchronize_rcu_normal(). The use of rcu_normal can be enforced by setting the kernel parameter rcupdate.rcu_normal=1 at boot. This makes recent kernels boot again on my Alphas, a simple enough workaround for now.

The code fails inside the work-queue handler wait_rcu_exp_gp() when it's trying to call rcu_exp_sel_wait_wake(). Looking at the code generated by the compiler, the call to rcu_exp_sel_wait_wake() appears to be inlined, so there is no actual call to this function. If I add some bogus code (i.e. a print call that references the address of a local variable, something that the compiler can't optimize away) before the call to rcu_exp_sel_wait_wake(), the code works! The same effect is achieved by declaring the local variable as volatile.

I've also noted a similar behavior in the SCSI driver code, where unloading a SCSI driver kernel module (in my case qla1280) will trigger a kernel Oops. As in the example above, this can be mitigated by adding a reference to local variables. When doing "rmmod qla1280", scsi_host_dev_release() calls rcu_barrier(). In this function call I noticed that the stack was somehow corrupted and the return address to scsi_host_dev_release() was overwritten. The stack corruption occurs in the "for_each_possible_cpu(cpu)" loop inside rcu_barrier(). Below are stack dumps from before/after the for_each_possible_cpu loop. The call to scsi_host_dev_release() disappears from the stack trace since its return address (fffffc0000b6a3ec) is replaced by a '1', and at the end of the call to rcu_barrier() we get a kernel Oops since $ra=1 is used as the return address.
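To make the corruption easier to see, the first eight words of the two stack dumps further down can be compared word by word. What follows is a minimal userspace sketch, not kernel code: the array names and report_changed_words() are made up for illustration, and the values are copied by hand from the "rcu_barrier 5" and "rcu_barrier 6" dumps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DUMP_WORDS 8

/* First eight words of the dump taken before the loop ("rcu_barrier 5"). */
static const uint64_t dump_before[DUMP_WORDS] = {
	0xfffffc000987fc88ULL, 0xfffffc0000e66440ULL,
	0xfffffc00003a8bc8ULL, 0x0000000000000000ULL,
	0xfffffc0000e667b0ULL, 0xfffffc000480b5d8ULL,
	0xfffffc0000b6a3ecULL, 0xfffffc0004a2a000ULL,
};

/* First eight words of the dump taken after the loop ("rcu_barrier 6"). */
static const uint64_t dump_after[DUMP_WORDS] = {
	0xfffffc000987fc88ULL, 0xfffffc0000e66440ULL,
	0xfffffc00003a8c44ULL, 0x0000000000000002ULL,
	0xfffffc0000e667b0ULL, 0xfffffc0000e44240ULL,
	0x0000000000000001ULL, 0xfffffc0004a2a000ULL,
};

/*
 * Print every word that differs between two dumps of n words each
 * and return how many words differ.
 */
static size_t report_changed_words(const uint64_t *before,
				   const uint64_t *after, size_t n)
{
	size_t changed = 0;

	assert(before != NULL && after != NULL);

	for (size_t i = 0; i < n; i++) {
		if (before[i] != after[i]) {
			printf("word %zu: %016llx -> %016llx\n", i,
			       (unsigned long long)before[i],
			       (unsigned long long)after[i]);
			changed++;
		}
	}
	return changed;
}
```

Calling report_changed_words(dump_before, dump_after, DUMP_WORDS) from a small main() reports words 2, 3, 5 and 6 as changed. Word 2 is the address inside rcu_barrier() itself, which presumably differs only because the two dumps are printed from different places in the function. Word 6 is the return address into scsi_host_dev_release() (fffffc0000b6a3ec) that has become 1.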
In both RCU cases above, stack corruption occurs and the sections that cause problems involve the use of kernel threads, so concurrency might be an issue here. Since the RCU code works on other platforms and can be "fixed" on Alpha as well just by declaring certain variables as volatile (or by other means making sure that they are not optimized away), can this be a compiler issue on Alpha, or is it the result of not taking proper measures in the code to account for the weak memory model on Alpha? Or a combination of the two?

/Magnus Lindholm

Stack traces showing the corrupted stack frames:
----------------------------------------------------------------
rcu: inside rcu_barrier 5
CPU: 1 UID: 0 PID: 1430 Comm: rmmod Not tainted 6.12.1-gentoo #43
       fffffc000987fc88 fffffc0000e66440 fffffc00003a8bc8 0000000000000000
       fffffc0000e667b0 fffffc000480b5d8 fffffc0000b6a3ec fffffc0004a2a000
       fffffc0004a2a240 fffffc000480b5d8 0000000000000000 fffffffc00502068
       0000020001043480 00000200010422a0 0000000000000000 0000000000000000
       fffffc0000b68efc fffffc0004a2a240 fffffc0006319300 0000000000000000
       fffffc0000b2ed80 fffffc0004a2a240 fffffc0000b9d278 0000000000000000
Trace:
[<fffffc00003a8bc8>] rcu_barrier+0x1f8/0x580
[<fffffc0000b6a3ec>] scsi_host_dev_release+0xac/0x1cc
[<fffffc0000b68efc>] device_release+0x148/0x218
[<fffffc0000b2ed80>] kobject_put+0x1d0/0x270
[<fffffc00007cac3c>] put_device+0x1c/0x30
[<fffffc00007f47cc>] scsi_host_put+0x1c/0x30
[<fffffc00007554a4>] pci_device_remove+0x34/0x90
[<fffffc00007d5c04>] device_remove+0x64/0xb0
[<fffffc00007d7694>] device_release_driver_internal+0x294/0x380
[<fffffc00007d783c>] driver_detach+0x7c/0x110
[<fffffc00007d5240>] bus_remove_driver+0xa0/0x150
[<fffffc00007d80c4>] driver_unregister+0x44/0xa0
[<fffffc00007552f8>] pci_unregister_driver+0x38/0xd0
[<fffffc00003bbb7c>] sys_delete_module+0x19c/0x320
[<fffffc0000310d34>] entSys+0xa4/0xc0

rcu: inside rcu_barrier 6
CPU: 1 UID: 0 PID: 1430 Comm: rmmod Not tainted 6.12.1-gentoo #43
       fffffc000987fc88 fffffc0000e66440 fffffc00003a8c44 0000000000000002
       fffffc0000e667b0 fffffc0000e44240 0000000000000001 fffffc0004a2a000
       fffffc0004a2a240 fffffc000480b5d8 0000000000000000 fffffffc00502068
       0000020001043480 00000200010422a0 0000000000000000 0000000000000000
       fffffc0000b68efc fffffc0004a2a240 fffffc0006319300 0000000000000000
       fffffc0000b2ed80 fffffc0004a2a240 fffffc0000b9d278 0000000000000000
Trace:
[<fffffc00003a8c44>] rcu_barrier+0x274/0x580
[<fffffc0000b68efc>] device_release+0x148/0x218
[<fffffc0000b2ed80>] kobject_put+0x1d0/0x270
[<fffffc00007cac3c>] put_device+0x1c/0x30
[<fffffc00007f47cc>] scsi_host_put+0x1c/0x30
[<fffffc00007554a4>] pci_device_remove+0x34/0x90
[<fffffc00007d5c04>] device_remove+0x64/0xb0
[<fffffc00007d7694>] device_release_driver_internal+0x294/0x380
[<fffffc00007d783c>] driver_detach+0x7c/0x110
[<fffffc00007d5240>] bus_remove_driver+0xa0/0x150
[<fffffc00007d80c4>] driver_unregister+0x44/0xa0
[<fffffc00007552f8>] pci_unregister_driver+0x38/0xd0
[<fffffc00003bbb7c>] sys_delete_module+0x19c/0x320
[<fffffc0000310d34>] entSys+0xa4/0xc0

Unable to handle kernel paging request at virtual address 0000000000000000
CPU 1 rmmod(1430): Oops -1
pc = [<0000000000000000>]  ra = [<0000000000000001>]  ps = 0000    Not tainted
pc is at 0x0
ra is at 0x1
v0 = 0000000000000007  t0 = fffffc0000ec7aa8  t1 = ffffffffffffffff
t2 = fffffc0000e65df0  t3 = 00000000000026f0  t4 = 00000000000028f1
t5 = 00000000000c2e20  t6 = 00000000000c2e68  t7 = fffffc000987c000
s0 = fffffc0004a2a000  s1 = fffffc0004a2a240  s2 = fffffc000480b5d8
s3 = 0000000000000000  s4 = fffffffc00502068  s5 = 0000020001043480
s6 = 00000200010422a0
a0 = 0000000000000000  a1 = 0000000000000001  a2 = 00000000000028f0
a3 = fffffc000987fa38  a4 = 0000000000000000  a5 = 0000000000000000
t8 = 00000000000c2e20  t9 = ffffffffffffffec  t10= 0000000000000001
t11= 00000001000024f0  pv = fffffc000038a1f0  at = 0000000000000000
gp = fffffc0000eb7aa8  sp = 00000000183e6a07
Disabling lock debugging due to kernel taint
Trace:
[<fffffc0000b68efc>] device_release+0x148/0x218
[<fffffc0000b2ed80>] kobject_put+0x1d0/0x270
[<fffffc00007cac3c>] put_device+0x1c/0x30
[<fffffc00007f47cc>] scsi_host_put+0x1c/0x30
[<fffffc00007554a4>] pci_device_remove+0x34/0x90
[<fffffc00007d5c04>] device_remove+0x64/0xb0
[<fffffc00007d7694>] device_release_driver_internal+0x294/0x380
[<fffffc00007d783c>] driver_detach+0x7c/0x110
[<fffffc00007d5240>] bus_remove_driver+0xa0/0x150
[<fffffc00007d80c4>] driver_unregister+0x44/0xa0
[<fffffc00007552f8>] pci_unregister_driver+0x38/0xd0
[<fffffc00003bbb7c>] sys_delete_module+0x19c/0x320
[<fffffc0000310d34>] entSys+0xa4/0xc0

Below are the changes I made to the kernel source in order to mitigate the stack corruption problem. This is not really a fix, but it can be of use to gain further knowledge of what's really going on:
------------------------------------------------------------------------------------
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ff98233d4aa5..8241313404f7 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4553,7 +4553,7 @@ static void rcu_barrier_handler(void *cpu_in)
  */
 void rcu_barrier(void)
 {
-	uintptr_t cpu;
+	volatile uintptr_t cpu;
 	unsigned long flags;
 	unsigned long gseq;
 	struct rcu_data *rdp;
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index fb664d3a01c9..afba0ebc80e4 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -477,7 +477,7 @@ static inline void sync_rcu_exp_select_cpus_flush_work(struct rcu_node *rnp)
  */
 static void wait_rcu_exp_gp(struct kthread_work *wp)
 {
-	struct rcu_exp_work *rewp;
+	volatile struct rcu_exp_work *rewp;
 
 	rewp = container_of(wp, struct rcu_exp_work, rew_work);
 	rcu_exp_sel_wait_wake(rewp->rew_s);
@@ -705,6 +705,7 @@ static void rcu_exp_wait_wake(unsigned long s)
  */
 static void rcu_exp_sel_wait_wake(unsigned long s)
 {
+	pr_warn("inside rcu_exp_sel_wait_wake, %llx\n",(void*)s);
 	/* Initialize the rcu_node tree in preparation for the wait. */
 	sync_rcu_exp_select_cpus();

On Sun, Dec 1, 2024 at 6:04 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Sun, Dec 01, 2024 at 11:09:10AM +0100, Magnus Lindholm wrote:
> > On Sun, Dec 1, 2024 at 5:31 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > > Does booting with the "rcupdate.rcu_normal=1" kernel boot parameter
> > > also suppress the problem?
> >
> > Setting rcupdate.rcu_normal=1 also suppresses the problem. I guess this makes
> > RCU code now do synchronize_rcu_normal() instead of the full
> > synchronize_rcu_expedited() which is where I get the kernel Oops.
>
> Exactly, though the effect is that any call to synchronize_rcu_expedited()
> instead results in a call to synchronize_rcu().
>
> Which means that you can work around this problem without having to
> carry patches and without having to slow down network configuration for
> everyone else. ;-)
>
> > > That "pc =" down below is the program counter? If so, I am at a loss
> > > as to what RCU could do to make it be zero.
> >
> > Not sure why this happens, if the RCU code is passing around pointers to
> > worker function and this somehow ends up being a null pointer on the Alpha?
>
> Are frame pointers enabled on your setup? If not, could you please
> enable them and reproduce the problem? Could you also please try
> building and reproducing with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y?
>
> Thanx, Paul