On Sun, Oct 04, 2020 at 12:12:28PM -0700, Jianxin Xiong wrote:
> Dma-buf is a standard cross-driver buffer sharing mechanism that can be
> used to support peer-to-peer access from RDMA devices.
>
> Device memory exported via dma-buf is associated with a file descriptor.
> This is passed to the user space as a property associated with the
> buffer allocation. When the buffer is registered as a memory region,
> the file descriptor is passed to the RDMA driver along with other
> parameters.
>
> Implement the common code for importing dma-buf object and mapping
> dma-buf pages.
>
> Signed-off-by: Jianxin Xiong <jianxin.xiong@xxxxxxxxx>
> Reviewed-by: Sean Hefty <sean.hefty@xxxxxxxxx>
> Acked-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
> ---
>  drivers/infiniband/core/Makefile      |   2 +-
>  drivers/infiniband/core/umem.c        |   4 +
>  drivers/infiniband/core/umem_dmabuf.c | 291 ++++++++++++++++++++++++++++++++++
>  drivers/infiniband/core/umem_dmabuf.h |  14 ++
>  drivers/infiniband/core/umem_odp.c    |  12 ++
>  include/rdma/ib_umem.h                |  19 ++-
>  6 files changed, 340 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/infiniband/core/umem_dmabuf.c
>  create mode 100644 drivers/infiniband/core/umem_dmabuf.h

I think this is using ODP too literally. dmabuf isn't going to need
fine-grained page faults, and I'm not sure this locking scheme is OK -
ODP is horrifically complicated.

If this is the approach then I think we should make dmabuf its own
standalone API, reg_user_mr_dmabuf()

The implementation in mlx5 will be much more understandable: it would
just do dma_buf_dynamic_attach() and program the XLT exactly the same
as a normal umem.

The move_notify() simply zaps the XLT and queues work to reload it
after the move. Locking is provided by the dma_resv_lock. Only a small
disruption to the page fault handler is needed.
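
For illustration only, here is a toy user-space model of the flow
described above. It is not kernel code: every name in it (mr_model,
model_*) is hypothetical, and a boolean stands in for dma_resv_lock.
It only shows the ordering being suggested: map and program the
translation inside one critical section, zap it from move_notify()
while the lock is held, then reload from deferred work.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; all names are hypothetical, not kernel or mlx5 API. */
struct mr_model {
	bool resv_locked;   /* stands in for holding dma_resv_lock */
	bool mapped;        /* stands in for a live attachment mapping */
	bool xlt_valid;     /* stands in for the HW translation being programmed */
	bool reload_queued; /* stands in for pending reload work */
};

void model_lock(struct mr_model *mr)
{
	assert(!mr->resv_locked);
	mr->resv_locked = true;
}

void model_unlock(struct mr_model *mr)
{
	assert(mr->resv_locked);
	mr->resv_locked = false;
}

/* Map the buffer and program the "HW" without dropping the lock in between. */
void model_map_and_program(struct mr_model *mr)
{
	model_lock(mr);
	mr->mapped = true;
	mr->xlt_valid = true;
	mr->reload_queued = false;
	model_unlock(mr);
}

/* The exporter is assumed to call this with the reservation lock held. */
void model_move_notify(struct mr_model *mr)
{
	assert(mr->resv_locked);
	mr->xlt_valid = false;    /* zap the translation */
	mr->mapped = false;       /* the old mapping is no longer usable */
	mr->reload_queued = true; /* defer the reload to a worker */
}

/* Deferred work: redo the map-and-program step after the move. */
void model_reload_work(struct mr_model *mr)
{
	if (mr->reload_queued)
		model_map_and_program(mr);
}
```

In this model the translation is never marked valid outside the lock,
which is the property the real locking would have to provide.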

> +	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> +	sgt = dma_buf_map_attachment(umem_dmabuf->attach,
> +				     DMA_BIDIRECTIONAL);
> +	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);

This doesn't look right: this lock has to be held until the HW is
programmed.

The use of atomic probably looks wrong as well.

> +	k = 0;
> +	total_pages = ib_umem_odp_num_pages(umem_odp);
> +	for_each_sg(umem->sg_head.sgl, sg, umem->sg_head.nents, j) {
> +		addr = sg_dma_address(sg);
> +		pages = sg_dma_len(sg) >> page_shift;
> +		while (pages > 0 && k < total_pages) {
> +			umem_odp->dma_list[k++] = addr | access_mask;
> +			umem_odp->npages++;
> +			addr += page_size;
> +			pages--;

This isn't fragmenting the sg into a page list properly; it won't work
for unaligned things.

And really we don't need the dma_list for this case. With a fixed
whole-mapping DMA SGL, a normal umem sgl is OK and the normal umem XLT
programming in mlx5 is fine.

Jason
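
For illustration only, a minimal user-space sketch of the alignment
concern with the quoted loop. The struct and function names are
hypothetical stand-ins, not scatterlist or umem API. Shifting the
segment length right by page_shift drops partial pages and ignores a
start address that is not page aligned, whereas the sketch below
counts every page a segment touches. It does not check that interior
segment boundaries are page aligned, which a real page list would
also need.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_seg {
	uint64_t addr; /* DMA address, possibly not page aligned */
	uint64_t len;  /* length in bytes, possibly a partial page */
};

/*
 * Write the page-aligned address of every page touched by segs[] into
 * pages[], storing at most max entries. Returns the number of entries
 * required, which can exceed max.
 */
size_t segs_to_pages(const struct dma_seg *segs, size_t nsegs,
		     unsigned int page_shift, uint64_t *pages, size_t max)
{
	const uint64_t page_size = (uint64_t)1 << page_shift;
	const uint64_t page_mask = ~(page_size - 1);
	size_t n = 0;

	for (size_t i = 0; i < nsegs; i++) {
		uint64_t first, last, p;

		if (segs[i].len == 0)
			continue;

		/* Round the start down and the last byte down to page bounds. */
		first = segs[i].addr & page_mask;
		last = (segs[i].addr + segs[i].len - 1) & page_mask;

		for (p = first; ; p += page_size) {
			if (n < max)
				pages[n] = p;
			n++;
			if (p == last)
				break;
		}
	}
	return n;
}
```

With 4 KiB pages, a 0x1000-byte segment starting at 0x1800 touches two
pages (0x1000 and 0x2000), while len >> page_shift would report one.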