On 1/15/23 20:21, lizhijian@xxxxxxxxxxx wrote:
>
>
> On 14/01/2023 07:27, Bob Pearson wrote:
>> Replace struct rxe_phys_buf and struct rxe_map by struct xarray
>> in rxe_verbs.h. This allows using rcu locking on reads for
>> the memory maps stored in each mr.
>>
>> This is based off of a sketch of a patch from Jason Gunthorpe in the
>> link below. Some changes were needed to make this work. It applies
>> cleanly to the current for-next and passes the pyverbs, perftest
>> and the same blktests test cases which run today.
>>
>> Link: https://lore.kernel.org/linux-rdma/Y3gvZr6%2FNCii9Avy@xxxxxxxxxx/
>> Co-developed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
>> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_loc.h   |   1 -
>>   drivers/infiniband/sw/rxe/rxe_mr.c    | 533 ++++++++++++--------------
>>   drivers/infiniband/sw/rxe/rxe_resp.c  |   2 +-
>>   drivers/infiniband/sw/rxe/rxe_verbs.h |  21 +-
>>   4 files changed, 251 insertions(+), 306 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
>> index fd70c71a9e4e..0cf78f9bb27c 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_loc.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_loc.h
>> @@ -71,7 +71,6 @@ int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma,
>>   		void *addr, int length, enum rxe_mr_copy_dir dir);
>>   int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
>>   		  int sg_nents, unsigned int *sg_offset);
>> -void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length);
>>   int rxe_mr_do_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
>>   			u64 compare, u64 swap_add, u64 *orig_val);
>>   int rxe_mr_do_atomic_write(struct rxe_mr *mr, u64 iova, void *addr);
>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>> index fd5537ee7f04..e4634279080a 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> [snip...]
>
>> -static bool is_pmem_page(struct page *pg)
>> +static unsigned long rxe_mr_iova_to_index(struct rxe_mr *mr, u64 iova)
>>   {
>> -	unsigned long paddr = page_to_phys(pg);
>> +	return (iova >> mr->page_shift) - (mr->ibmr.iova >> mr->page_shift);
>> +}
>>
>> -	return REGION_INTERSECTS ==
>> -	       region_intersects(paddr, PAGE_SIZE, IORESOURCE_MEM,
>> -				 IORES_DESC_PERSISTENT_MEMORY);
>
> [snip...]
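The hunk above replaces the old two-level map walk with a flat page index into an xarray, which is what enables the rcu-protected reads mentioned in the commit message. As a minimal sketch of that read side, reusing rxe_mr_iova_to_index() from the hunk above and assuming the mr stores its pages in an xarray field (called page_list here purely for illustration; the helper name and exact wiring are not taken from the patch):

	/*
	 * Illustrative only: look up the page backing an iova in the mr's
	 * xarray. xa_load() may be called from an RCU read-side critical
	 * section, which is what makes lockless reads of the map possible.
	 */
	static struct page *rxe_mr_iova_to_page(struct rxe_mr *mr, u64 iova)
	{
		unsigned long index = rxe_mr_iova_to_index(mr, iova);
		struct page *page;

		rcu_read_lock();
		page = xa_load(&mr->page_list, index);	/* assumed field name */
		rcu_read_unlock();

		return page;
	}

For a user MR the pages are pinned by ib_umem_get(), so the looked-up struct page remains usable after the RCU read section ends; RCU here protects the traversal of the xarray itself.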
>
>> +	rxe_mr_init(access, mr);
>> +
>>   	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>>   	if (IS_ERR(umem)) {
>>   		rxe_dbg_mr(mr, "Unable to pin memory region err = %d\n",
>>   			(int)PTR_ERR(umem));
>> -		err = PTR_ERR(umem);
>> -		goto err_out;
>> +		return PTR_ERR(umem);
>>   	}
>>
>> -	num_buf = ib_umem_num_pages(umem);
>> -
>> -	rxe_mr_init(access, mr);
>> -
>> -	err = rxe_mr_alloc(mr, num_buf);
>> +	err = rxe_mr_fill_pages_from_sgt(mr, &umem->sgt_append.sgt);
>>   	if (err) {
>> -		rxe_dbg_mr(mr, "Unable to allocate memory for map\n");
>> -		goto err_release_umem;
>> +		ib_umem_release(umem);
>> +		return err;
>>   	}
>>
>> -	num_buf = 0;
>> -	map = mr->map;
>> -	if (length > 0) {
>> -		bool persistent_access = access & IB_ACCESS_FLUSH_PERSISTENT;
>> -
>> -		buf = map[0]->buf;
>> -		for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
>> -			struct page *pg = sg_page_iter_page(&sg_iter);
>> +	mr->umem = umem;
>> +	mr->ibmr.type = IB_MR_TYPE_USER;
>> +	mr->state = RXE_MR_STATE_VALID;
>>
>> -			if (persistent_access && !is_pmem_page(pg)) {
>> -				rxe_dbg_mr(mr, "Unable to register persistent access to non-pmem device\n");
>> -				err = -EINVAL;
>> -				goto err_release_umem;
>> -			}
>
> I see that you removed is_pmem_page() and its check, but there is no description of this change.
> IMO, the check is required by the IBA spec.
>
> Thanks
> Zhijian

Zhijian,

It was dropped by accident. I just posted an update with the calls put back.
Please take a look and, if possible, test it to confirm that it doesn't regress
your pmem changes. I have no way to test pmem.

Please also look at the comment I added to the pmem flush call. I am suspicious
of the smp_store_release() call based on the original comment.

Regards,
Bob
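For reference, the pmem check being put back is the one visible in the removal hunk quoted above; roughly as follows (reconstructed from the quoted diff, with its placement in the new xarray fill path being an assumption):

	/* True when the page's physical range is described as persistent memory */
	static bool is_pmem_page(struct page *pg)
	{
		unsigned long paddr = page_to_phys(pg);

		return REGION_INTERSECTS ==
		       region_intersects(paddr, PAGE_SIZE, IORESOURCE_MEM,
					 IORES_DESC_PERSISTENT_MEMORY);
	}

Registration with IB_ACCESS_FLUSH_PERSISTENT would then fail with -EINVAL if any page of the region is not pmem, as in the removed loop quoted above.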