On 7/28/23 09:16, Guoqing Jiang wrote:
On 7/28/23 01:29, Jason Gunthorpe wrote:
On Thu, Jul 27, 2023 at 05:17:40PM +0000, Bernard Metzler wrote:
-----Original Message-----
From: Guoqing Jiang <guoqing.jiang@xxxxxxxxx>
Sent: Thursday, 27 July 2023 16:04
To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; jgg@xxxxxxxx;
leon@xxxxxxxxxx
Cc: linux-rdma@xxxxxxxxxxxxxxx
Subject: [EXTERNAL] [PATCH 0/5] Fix potential issues for siw
Hi,
Several issues appeared if we rmmod siw module after failed to insert
the module (with manual change like below).
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -577,6 +577,7 @@ static __init int siw_init_module(void)
if (rv)
goto out_error;
+ goto out_error;
rdma_link_register(&siw_link_ops);
Basically, these issues are double free, use before initalization or
null pointer dereference. For more details, pls review the individual
patch.
Thanks,
Guoqing
Hi Guoqing,
very good catch, thank you. I was under the wrong assumption a
module is not loaded if the init_module() returns a value.
I think that is actually true, isn't it? I'm confused?
Yes, you are right. Since rv is still 0, so the module appears in the
kernel. Not sure if some tool could inject err like this. Feel free to
ignore this.
The below trace happened if I run a stress test with load siw module and
unload siw in a loop, which should be fixed by patch 5, so I think we
need to apply it, what do you think?
[ 414.537961] BUG: spinlock bad magic on CPU#0, modprobe/3722
[ 414.537965] lock: 0xffff9d847bc380e8, .magic: 00000000, .owner:
<none>/-1, .owner_cpu: 0
[ 414.537969] CPU: 0 PID: 3722 Comm: modprobe Tainted: G OE
6.5.0-rc3+ #16
[ 414.537971] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[ 414.537973] Call Trace:
[ 414.537973] <TASK>
[ 414.537975] dump_stack_lvl+0x77/0xd0
[ 414.537979] dump_stack+0x10/0x20
[ 414.537981] spin_bug+0xa5/0xd0
[ 414.537984] do_raw_spin_lock+0x90/0xd0
[ 414.537985] _raw_spin_lock_irqsave+0x56/0x80
[ 414.537988] ? __wake_up_common_lock+0x63/0xd0
[ 414.537990] __wake_up_common_lock+0x63/0xd0
[ 414.537992] __wake_up+0x13/0x30
[ 414.537994] siw_stop_tx_thread+0x49/0x70 [siw]
[ 414.538000] siw_exit_module+0x30/0x620 [siw]
[ 414.538006] __do_sys_delete_module.constprop.0+0x18f/0x300
[ 414.538008] ? syscall_enter_from_user_mode+0x21/0x70
[ 414.538010] ? __this_cpu_preempt_check+0x13/0x20
[ 414.538012] ? lockdep_hardirqs_on+0x86/0x120
[ 414.538014] __x64_sys_delete_module+0x12/0x20
[ 414.538016] do_syscall_64+0x5c/0x90
[ 414.538019] ? do_syscall_64+0x69/0x90
[ 414.538020] ? __this_cpu_preempt_check+0x13/0x20
[ 414.538022] ? lockdep_hardirqs_on+0x86/0x120
[ 414.538024] ? syscall_exit_to_user_mode+0x37/0x50
[ 414.538025] ? do_syscall_64+0x69/0x90
[ 414.538026] ? syscall_exit_to_user_mode+0x37/0x50
[ 414.538027] ? do_syscall_64+0x69/0x90
[ 414.538029] ? syscall_exit_to_user_mode+0x37/0x50
[ 414.538030] ? do_syscall_64+0x69/0x90
[ 414.538032] ? irqentry_exit_to_user_mode+0x25/0x30
[ 414.538033] ? irqentry_exit+0x77/0xb0
[ 414.538034] ? exc_page_fault+0xae/0x240
[ 414.538036] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 414.538038] RIP: 0033:0x7f177eb26c9b
Thanks,
Guoqing