Hi Jason & Bob CC Guoqing @Guoqing, It may correlate with your previous bug report: https://lore.kernel.org/all/20220210073655.42281-1-guoqing.jiang@xxxxxxxxx/T/ It's observed that a same MR in rnbd server will trigger below code path: -> rxe_mr_init_fast() |-> alloc map_set() # map_set is uninitialized |...-> rxe_map_mr_sg() # build the map_set |-> rxe_mr_set_page() |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means # we can access host memory(such rxe_mr_copy) |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE, # but map_set was not built again |...-> rxe_mr_copy() # kernel crash due to access wild addresses # that lookup from the map_set I draft a patch like below for it, but i wonder if it's rxe's responsibility to do such checking. Any comments are very welcome. From e9d0bd821f07f5e049027f07b3ce9dc283624201 Mon Sep 17 00:00:00 2001 From: Li Zhijian <lizhijian@xxxxxxxxxxx> Date: Tue, 24 May 2022 10:56:19 +0800 Subject: [PATCH] RDMA/rxe: check map_set valid when handle IB_WR_REG_MR It's observed that a same MR in rnbd server will trigger below code path: -> rxe_mr_init_fast() |-> alloc map_set() # map_set is uninitialized |...-> rxe_map_mr_sg() # build the map_set |-> rxe_mr_set_page() |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means # we can access host memory(such rxe_mr_copy) |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE, # but map_set was not built again |...-> rxe_mr_copy() # kernel crash due to access wild addresses # that lookup from the map_set Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx> --- drivers/infiniband/sw/rxe/rxe_mr.c | 9 +++++++++ drivers/infiniband/sw/rxe/rxe_verbs.c | 1 + drivers/infiniband/sw/rxe/rxe_verbs.h | 1 + 3 files changed, 11 insertions(+) diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index 787c7dadc14f..09673d559c06 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -90,6 +90,7 @@ static int rxe_mr_alloc_map_set(int num_map, struct rxe_map_set **setp) if (!set->map) goto err_free_set; + set->valid = false; for (i = 0; i < num_map; i++) { set->map[i] = kmalloc(sizeof(struct rxe_map), GFP_KERNEL); if (!set->map[i]) @@ -216,6 +217,7 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova, } set = mr->cur_map_set; + set->valid = true; set->page_shift = PAGE_SHIFT; set->page_mask = PAGE_SIZE - 1; @@ -643,6 +645,7 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey) } mr->state = RXE_MR_STATE_FREE; + mr->cur_map_set->valid = mr->next_map_set->valid = false; ret = 0; err_drop_ref: @@ -679,12 +682,18 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe) return -EINVAL; } + if (!mr->next_map_set->valid) { + pr_warn("%s: map set is not valid\n", __func__); + return -EINVAL; + } + mr->access = access; mr->lkey = (mr->lkey & ~0xff) | key; mr->rkey = (access & IB_ACCESS_REMOTE) ? mr->lkey : 0; mr->state = RXE_MR_STATE_VALID; set = mr->cur_map_set; + set->valid = false; mr->cur_map_set = mr->next_map_set; mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova; mr->next_map_set = set; diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 58e4412b1d16..4b7ae2d1d921 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -992,6 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, set->page_shift = ilog2(ibmr->page_size); set->page_mask = ibmr->page_size - 1; set->offset = set->iova & set->page_mask; + set->valid = true; return n; } diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index 86068d70cd95..2edf31aab7e1 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -289,6 +289,7 @@ struct rxe_map { struct rxe_map_set { struct rxe_map **map; + bool valid; u64 va; u64 iova; size_t length; -- 2.31.1 On 23/05/2022 22:02, Li, Zhijian wrote: > > on 2022/5/20 22:45, Jason Gunthorpe wrote: >> On Wed, May 18, 2022 at 12:37:25PM +0800, Li Zhijian wrote: >>> Below call chains will alloc map_set without fully initializing map_set. >>> rxe_mr_init_fast() >>> -> rxe_mr_alloc() >>> -> rxe_mr_alloc_map_set() >>> >>> Uninitialized values inside struct rxe_map_set are possible to cause >>> kernel panic. >> If the value is uninitialized then why is 0 an OK value? >> >> Would be happier to know the exact value that is not initialized > > Well, good question. After re-think of this issue, it seems this patch wasn't the root cause though it made the crash disappear in some extent. > > I'm still working on the root cause :) > > Thanks > > Zhijian > > >> >> Jason > >