Re: [RFC] bulk zero copy transport


Hi Dennis,

just as a wild idea: would it be an option to use the SMB-Direct protocol defined in [1]?

It basically provides a stream/packet-like transport based on IB_WR_SEND[_WITH_INV],
and in addition it allows direct memory transfers with IB_WR_RDMA_READ or IB_WR_RDMA_WRITE.

It's called SMB-Direct as it's currently used as a transport for the SMB3 protocol,
but it could also be used as transport for other things.

Over the last few years I've been working on a PF_SMBDIRECT socket driver [2]
as a hobby project in order to support it for Samba. It's not yet production ready
and has known memory leaks, but the basics already work. The API [3] is based on
sendmsg/recvmsg, using MSG_OOB together with msg->msg_control for direct memory transfers.
I'll actually use it with IORING_OP_SENDMSG and IORING_OP_RECVMSG, which allow msg->msg_control
starting with 5.12 kernels.
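
To give a rough idea of the calling pattern, here is a minimal sketch of how a
request with a control message could be prepared for sendmsg(). This is only an
illustration under assumptions: EX_SOL_SMBDIRECT, EX_CMSG_RDMA_WRITE,
struct ex_rdma_desc and ex_prepare_rdma_write() are made-up placeholders and are
NOT the names or layouts from smbdirect.h [3]. Only the generic msghdr/cmsg
handling is standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical placeholders, NOT taken from smbdirect.h. */
#define EX_SOL_SMBDIRECT   0x7fff
#define EX_CMSG_RDMA_WRITE 1

/* Hypothetical description of the remote buffer for a direct transfer. */
struct ex_rdma_desc {
	uint64_t remote_offset;
	uint32_t remote_token;
	uint32_t length;
};

/*
 * Keeps the msghdr, its iovec and its control buffer together.
 * The msghdr points into this struct, so do not copy it after preparing.
 */
struct ex_request {
	struct msghdr msg;
	struct iovec iov;
	union {
		char buf[CMSG_SPACE(sizeof(struct ex_rdma_desc))];
		struct cmsghdr align; /* forces correct alignment of buf */
	} control;
};

/*
 * Fill in req so that the local buffer data/len is described by the iovec
 * and the remote buffer is described by one control message.
 * Returns the flags to pass to sendmsg().
 */
static int ex_prepare_rdma_write(struct ex_request *req,
				 void *data, size_t len,
				 const struct ex_rdma_desc *desc)
{
	struct cmsghdr *cmsg;

	memset(req, 0, sizeof(*req));

	req->iov.iov_base = data;
	req->iov.iov_len = len;
	req->msg.msg_iov = &req->iov;
	req->msg.msg_iovlen = 1;
	req->msg.msg_control = req->control.buf;
	req->msg.msg_controllen = sizeof(req->control.buf);

	cmsg = CMSG_FIRSTHDR(&req->msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = EX_SOL_SMBDIRECT;
	cmsg->cmsg_type = EX_CMSG_RDMA_WRITE;
	cmsg->cmsg_len = CMSG_LEN(sizeof(*desc));
	memcpy(CMSG_DATA(cmsg), desc, sizeof(*desc));

	/* MSG_OOB marks the request as a direct memory transfer. */
	return MSG_OOB;
}
```

The prepared request would then be submitted with sendmsg(fd, &req.msg, flags),
or handed to io_uring, e.g. with liburing's
io_uring_prep_sendmsg(sqe, fd, &req.msg, flags).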

metze

[1] https://winprotocoldoc.blob.core.windows.net/productionwindowsarchives/MS-SMBD/%5bMS-SMBD%5d.pdf
[2] https://git.samba.org/?p=metze/linux/smbdirect.git;a=summary
[3] https://git.samba.org/?p=metze/linux/smbdirect.git;a=blob;f=smbdirect.h;hb=refs/heads/smbdirect-work-in-progress


On 19.08.21 at 21:09, Dennis Dalessandro wrote:

> Just wanted to float an idea we are thinking about. It builds on the basic idea
> of what Intel submitted as their RV module [1]. This however does things a bit
> differently and is really all about bulk zero-copy using the kernel. It is a new
> ULP.
> 
> The major differences are that there will be no new cdev needed. We will make
> use of the existing HFI1 cdev where an FD is needed. We also propose to make use
> of IO-Uring (hence needing FD) to get requests into the kernel. The idea will be
> to not share Uverbs objects with the kernel. The kernel will maintain
> ownership of the qp, pd, mr, cq, etc.
> 
> Connections we envision to be maintained by the kernel using RDMA CM. Similar in
> fashion to how RDS or IPoIB works. This of course means an RC QP which allows
> our TID RDMA feature to work under the hood.
> 
> We have looked into RDS and RTRS and both seem to be the wrong interface. RDS
> provides a lot of what we are looking for but it seems to be a bit overkill and
> has higher overhead than we hope to achieve. Performance results show it to be
> less performant than direct to verbs.
> 
> After reviewing the RV submission, I don't think there is any reason to try to
> revamp that submission. It seems to be very tightly tied to PSM3 whereas this is
> meant to be more generic.
> 
> At this point we are interested in what questions you would have or opinions. We
> would like to get some feedback early in the process. As we develop the code
> we'll continue to post, similar to how we did rdmavt and welcome anyone that
> wants to collaborate.
> 
> [1] https://lore.kernel.org/linux-rdma/20210319125635.34492-1-kaike.wan@xxxxxxxxx/
> 
> -Denny
> 



