On Wed, Dec 27, 2023 at 03:40:35PM +0800, Wen Gu wrote:
> A crash was found when dumping SMC-R connections. It can be reproduced
> by following steps:
>
> - environment: two RNICs on both sides.
> - run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
>   will be created.
> - set the first RNIC down on either side and link group will turn to
>   SMC_LGR_ASYMMETRIC_LOCAL then.
> - run 'smcss -R' and the crash will be triggered.
>
>  BUG: kernel NULL pointer dereference, address: 0000000000000010
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
>  Oops: 0000 [#1] PREEMPT SMP PTI
>  CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W E 6.7.0-rc6+ #51
>  RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
>  Call Trace:
>   <TASK>
>   ? __die+0x24/0x70
>   ? page_fault_oops+0x66/0x150
>   ? exc_page_fault+0x69/0x140
>   ? asm_exc_page_fault+0x26/0x30
>   ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
>   smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
>   smc_diag_dump+0x26/0x60 [smc_diag]
>   netlink_dump+0x19f/0x320
>   __netlink_dump_start+0x1dc/0x300
>   smc_diag_handler_dump+0x6a/0x80 [smc_diag]
>   ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
>   sock_diag_rcv_msg+0x121/0x140
>   ? __pfx_sock_diag_rcv_msg+0x10/0x10
>   netlink_rcv_skb+0x5a/0x110
>   sock_diag_rcv+0x28/0x40
>   netlink_unicast+0x22a/0x330
>   netlink_sendmsg+0x240/0x4a0
>   __sock_sendmsg+0xb0/0xc0
>   ____sys_sendmsg+0x24e/0x300
>   ? copy_msghdr_from_user+0x62/0x80
>   ___sys_sendmsg+0x7c/0xd0
>   ? __do_fault+0x34/0x1a0
>   ? do_read_fault+0x5f/0x100
>   ? do_fault+0xb0/0x110
>   __sys_sendmsg+0x4d/0x80
>   do_syscall_64+0x45/0xf0
>   entry_SYSCALL_64_after_hwframe+0x6e/0x76
>
> When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
> asymmetric link will be allocated in lgr->link[SMC_LINKS_PER_LGR_MAX - 1]
> by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
> in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
> in this issue. So fix it by accessing the right link.
>
> Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
> Reported-by: henaumars <henaumars@xxxxxxxx>
> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616

What about using Link: http... here?

> Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>

Reviewed-by: Tony Lu <tonylu@xxxxxxxxxxxxxxxxx>

> ---
>  net/smc/smc_diag.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
> index a584613aca12..5cc376834c57 100644
> --- a/net/smc/smc_diag.c
> +++ b/net/smc/smc_diag.c
> @@ -153,8 +153,7 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
>  			.lnk[0].link_id = link->link_id,
>  		};
>
> -		memcpy(linfo.lnk[0].ibname,
> -		       smc->conn.lgr->lnk[0].smcibdev->ibdev->name,
> +		memcpy(linfo.lnk[0].ibname, link->smcibdev->ibdev->name,
>  		       sizeof(link->smcibdev->ibdev->name));
>  		smc_gid_be16_convert(linfo.lnk[0].gid, link->gid);
>  		smc_gid_be16_convert(linfo.lnk[0].peer_gid, link->peer_gid);
> --
> 2.43.0
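
For readers following along without the kernel tree at hand, here is a
minimal user-space sketch of the situation the commit message describes.
It is not kernel code. All struct and function names (model_asymmetric_local,
dump_ibname, LINKS_PER_LGR_MAX, and so on) are hypothetical stand-ins, and
it assumes that a cleared slot leaves its device pointer NULL and that the
connection keeps its own pointer to the link it is using. The NULL checks
are an addition of this sketch and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define LINKS_PER_LGR_MAX 3	/* hypothetical stand-in for SMC_LINKS_PER_LGR_MAX */
#define IB_NAME_MAX 64		/* hypothetical name buffer size */

/* Simplified models of the structures named in the patch. */
struct ib_dev     { char name[IB_NAME_MAX]; };
struct smcib_dev  { struct ib_dev *ibdev; };
struct link       { struct smcib_dev *smcibdev; };
struct link_group { struct link lnk[LINKS_PER_LGR_MAX]; };
struct connection { struct link_group *lgr; struct link *lnk; };

/*
 * Model the state after the first RNIC went down: slot 0 is cleared and
 * the only usable link sits in the last slot of the link group.
 */
static void model_asymmetric_local(struct link_group *lgr,
				   struct connection *conn,
				   struct smcib_dev *smcibdev,
				   struct ib_dev *ibdev,
				   const char *devname)
{
	assert(lgr && conn && smcibdev && ibdev && devname);

	memset(lgr, 0, sizeof(*lgr));		/* every slot cleared, incl. lnk[0] */
	snprintf(ibdev->name, sizeof(ibdev->name), "%s", devname);
	smcibdev->ibdev = ibdev;
	lgr->lnk[LINKS_PER_LGR_MAX - 1].smcibdev = smcibdev;

	conn->lgr = lgr;
	conn->lnk = &lgr->lnk[LINKS_PER_LGR_MAX - 1];
}

/*
 * Copy the device name of the link the connection actually uses.
 * Returns 0 on success, -1 if that link has no device attached.
 */
static int dump_ibname(const struct connection *conn, char *out, size_t len)
{
	const struct link *link = conn->lnk;	/* not conn->lgr->lnk[0] */

	if (!link || !link->smcibdev || !link->smcibdev->ibdev)
		return -1;

	snprintf(out, len, "%s", link->smcibdev->ibdev->name);
	return 0;
}
```

In this model, reading the name through conn->lgr->lnk[0] would follow a
NULL smcibdev pointer, while going through the connection's own link
reaches the device in the last slot.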