On Thu, Jun 20, 2019 at 10:59:44AM -0600, Logan Gunthorpe wrote: > > > On 2019-06-20 10:49 a.m., Jason Gunthorpe wrote: > > On Thu, Jun 20, 2019 at 10:12:32AM -0600, Logan Gunthorpe wrote: > >> Introduce rdma_rw_ctx_dma_init() and rdma_rw_ctx_dma_destroy() which > >> peform the same operation as rdma_rw_ctx_init() and > >> rdma_rw_ctx_destroy() respectively except they operate on a DMA > >> address and length instead of an SGL. > >> > >> This will be used for struct page-less P2PDMA, but there's also > >> been opinions expressed to migrate away from SGLs and struct > >> pages in the RDMA APIs and this will likely fit with that > >> effort. > >> > >> Signed-off-by: Logan Gunthorpe <logang@xxxxxxxxxxxx> > >> drivers/infiniband/core/rw.c | 74 ++++++++++++++++++++++++++++++------ > >> include/rdma/rw.h | 6 +++ > >> 2 files changed, 69 insertions(+), 11 deletions(-) > >> > >> diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c > >> index 32ca8429eaae..cefa6b930bc8 100644 > >> +++ b/drivers/infiniband/core/rw.c > >> @@ -319,6 +319,39 @@ int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u8 port_num, > >> } > >> EXPORT_SYMBOL(rdma_rw_ctx_init); > >> > >> +/** > >> + * rdma_rw_ctx_dma_init - initialize a RDMA READ/WRITE context from a > >> + * DMA address instead of SGL > >> + * @ctx: context to initialize > >> + * @qp: queue pair to operate on > >> + * @port_num: port num to which the connection is bound > >> + * @addr: DMA address to READ/WRITE from/to > >> + * @len: length of memory to operate on > >> + * @remote_addr:remote address to read/write (relative to @rkey) > >> + * @rkey: remote key to operate on > >> + * @dir: %DMA_TO_DEVICE for RDMA WRITE, %DMA_FROM_DEVICE for RDMA READ > >> + * > >> + * Returns the number of WQEs that will be needed on the workqueue if > >> + * successful, or a negative error code. > >> + */ > >> +int rdma_rw_ctx_dma_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, > >> + u8 port_num, dma_addr_t addr, u32 len, u64 remote_addr, > >> + u32 rkey, enum dma_data_direction dir) > > > > Why not keep the same basic signature here but replace the scatterlist > > with the dma vec ? > > Could do. At the moment, I had no need for dma_vec in this interface. I think that is because you only did nvme not srp/iser :) > >> +{ > >> + struct scatterlist sg; > >> + > >> + sg_dma_address(&sg) = addr; > >> + sg_dma_len(&sg) = len; > > > > This needs to fail if the driver is one of the few that require > > struct page to work.. > > Yes, right. Currently P2PDMA checks for the use of dma_virt_ops. And > that probably should also be done here. But is that sufficient? You're > probably right that it'll take an audit of the RDMA tree to sort that out. For this purpose I'd be fine if you added a flag to the struct ib_device_ops that is set on drivers that we know are OK.. We can make that list bigger over time. > > This is not so hard to do, as most drivers are already struct page > > free, but is pretty much blocked on needing some way to go from the > > block layer SGL world to the dma vec world that does not hurt storage > > performance. > > Maybe I can end up helping with that if it helps push the ideas here > through. (And assuming people think it's an acceptable approach for the > block-layer side of things). Let us hope for a clear decision then Jason