On Thu, Jun 01, 2023 at 04:41:52PM +0800, Wen Gu wrote: > SMCRv1 has a similar issue to SMCRv2 (see link below) that may access > invalid MRs of RMBs when construct LLC ADD LINK CONT messages. > > BUG: kernel NULL pointer dereference, address: 0000000000000014 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 0 P4D 0 > Oops: 0000 [#1] PREEMPT SMP PTI > CPU: 5 PID: 48 Comm: kworker/5:0 Kdump: loaded Tainted: G W E 6.4.0-rc3+ #49 > Workqueue: events smc_llc_add_link_work [smc] > RIP: 0010:smc_llc_add_link_cont+0x160/0x270 [smc] > RSP: 0018:ffffa737801d3d50 EFLAGS: 00010286 > RAX: ffff964f82144000 RBX: ffffa737801d3dd8 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff964f81370c30 > RBP: ffffa737801d3dd4 R08: ffff964f81370000 R09: ffffa737801d3db0 > R10: 0000000000000001 R11: 0000000000000060 R12: ffff964f82e70000 > R13: ffff964f81370c38 R14: ffffa737801d3dd3 R15: 0000000000000001 > FS: 0000000000000000(0000) GS:ffff9652bfd40000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000014 CR3: 000000008fa20004 CR4: 00000000003706e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <TASK> > smc_llc_srv_rkey_exchange+0xa7/0x190 [smc] > smc_llc_srv_add_link+0x3ae/0x5a0 [smc] > smc_llc_add_link_work+0xb8/0x140 [smc] > process_one_work+0x1e5/0x3f0 > worker_thread+0x4d/0x2f0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0xe5/0x120 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x2c/0x50 > </TASK> > > When an alernate RNIC is available in system, SMC will try to add a new > link based on the RNIC for resilience. All the RMBs in use will be mapped > to the new link. Then the RMBs' MRs corresponding to the new link will > be filled into LLC messages. For SMCRv1, they are ADD LINK CONT messages. > > However smc_llc_add_link_cont() may mistakenly access to unused RMBs which > haven't been mapped to the new link and have no valid MRs, thus causing a > crash. So this patch fixes it. > > Fixes: 87f88cda2128 ("net/smc: rkey processing for a new link as SMC client") > Link: https://lore.kernel.org/r/1685101741-74826-3-git-send-email-guwen@xxxxxxxxxxxxxxxxx > Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx> This SGTM, thanks. Reviewed-by: Tony Lu <tonylu@xxxxxxxxxxxxxxxxx> > --- > net/smc/smc_llc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c > index 7a8d916..90f0b60 100644 > --- a/net/smc/smc_llc.c > +++ b/net/smc/smc_llc.c > @@ -851,6 +851,8 @@ static int smc_llc_add_link_cont(struct smc_link *link, > addc_llc->num_rkeys = *num_rkeys_todo; > n = *num_rkeys_todo; > for (i = 0; i < min_t(u8, n, SMC_LLC_RKEYS_PER_CONT_MSG); i++) { > + while (*buf_pos && !(*buf_pos)->used) > + *buf_pos = smc_llc_get_next_rmb(lgr, buf_lst, *buf_pos); > if (!*buf_pos) { > addc_llc->num_rkeys = addc_llc->num_rkeys - > *num_rkeys_todo; > @@ -867,8 +869,6 @@ static int smc_llc_add_link_cont(struct smc_link *link, > > (*num_rkeys_todo)--; > *buf_pos = smc_llc_get_next_rmb(lgr, buf_lst, *buf_pos); > - while (*buf_pos && !(*buf_pos)->used) > - *buf_pos = smc_llc_get_next_rmb(lgr, buf_lst, *buf_pos); > } > addc_llc->hd.common.llc_type = SMC_LLC_ADD_LINK_CONT; > addc_llc->hd.length = sizeof(struct smc_llc_msg_add_link_cont); > -- > 1.8.3.1