On 21-Dec-18 01:10, Jason Gunthorpe wrote: > On Thu, Dec 20, 2018 at 11:28:34AM +0200, Gal Pressman wrote: > >> It looks like the discussion didn't come to a conclusion, I'm trying >> to come up with a plan going forward and would like to get your >> opinion. > > I haven't heard any of the people who complained about usnic chime in > here.. > >> We can implement an rdma-core (libibverbs) userspace provider with >> support for standard UD (including the 40 bytes offset) and SRD QPs >> through direct verbs. I'll also add documentation for SRD QP type, >> even if we end up using it as a driver QP type. > > There is no reason to implement code you don't intend to maintain and > use. But if you are using the kernel QPT's you must follow the > protocols in the kernel driver. You should probably just give up on > the UD QPT if the HW doesn't follow verbs and use QPT_DRIVER2 for it. The ways in which we aren't UD QPT are very small, we're 90% there and the remaining 10% are simple enough. As I said, the 40 bytes thing will be changed. We will implement the missing QP states handling for UD QPs in firmware, without any driver/provider hacks. > > You also need to pass the rdma-core ABI checks, but that does not > require implementing a provider. Ofcourse. > >> The create/destroy AH issue will be solved with the sleepable flag, >> EFA can return -EOPNOTSUPP when called in an atomic context. When >> we'll add kernel verbs we can solve that the same way bnxt driver >> did (polling for completion). > > This seems fine ACK. > >> The EFA wire protocol is tightly coupled to the wire protocol for >> EC2’s VPC software defined network, which Amazon considers one of >> its proprietary differentiating features. > > This is a problem, we have never allowed proprietary wire protocols to > set a standard for multi-vendor verbs. > > We've allowed a lot of stuff in extended verbs, but never that. > > So that alone says the EFA proprietary stuff cannot be part of common > verbs at this point. But this is what QPT_DRIVER is for, so it isn't > insurmountable. I agree, EFA proprietary stuff should not be part of common verbs. > > But what you are left with is a driver that implements QPT_DRIVER and > nothing else. Sorry for not being clear, we will provide verbs QPT_UD in addition to QPT_DRIVER (SRD). In addition, we will provide a functional specification for SRD, its usage will be clear. > >> Kernel verbs are not supported right now, but we do have future >> plans to support that. > > This is a problem, as it was a big part of the objection to usnic. Is kernel API more important than user API? > > I'm not sure what kernel support you imagine, but nearly all existing > kernel drivers require actual RC RDMA. > > Frankly, if your device and driver would implement actual 'verbs' RC > RDMA then the 'usnic' concern goes away. > >> We are focused on our customer base, and due to our product offering >> we haven't seen customer demand for nvmeof support, which you have >> set as the bar for the RDMA subsystem. > > nvmeof uses the same feature set as all the standard storage ULPs and > represents the base level of functionality to enable alot of in-kernel > functionality. > > If you are imagining making new EFA specific kernel drivers using > proprietary protocols, then again, no, this is not what the RDMA > subsystem is for. Not sure how kernel verbs for EFA are going to look like, but I agree it should be supported through a standard IB_QPT. > >> I'd really like to avoid implementing things that do not interest >> our customers and will not have actual use. > > You should not do this, we don't want to maintain such dead code. Exactly. > >> I genuinely believe that EFA belongs in the RDMA subsystem, a lot >> more than vfio/anywhere else. > > And yet there are other teams saying the opposite about their devices > and are working to try and build a new subsystem that generically > provides WR/WC rings and userspace DMA.. > >> We enforce PDs and MRs in the device, use standard AH registration, >> remote QP numbers addressing, packet headers are constructed on the >> device, etc.. We are not simply hacking our way to userspace >> through the subsystem. > > What part of the ib_core support code does EFA actually use besides > the UAPI conduit? umem? anything else? Yes, core reference counts, resource tracking, ib_core commands validation/security checks, device registration, sysfs api. In the future, performance statistics (following Ariel's RFC), driver specific resource tracking (dump), kernel verbs. The features set will naturally increase over time. > > Jason > I'd really like to get to a point where we know how our next submission is going to look like. I understand that the lack of user code accompanied to the driver is an issue - that won't be the case in my next submission. As I said, my proposal is to add libibverbs support (with standard QPT_UD and QP states) and specification for SRD. In addition, I can come up with an RFC for hiding EFA from in-kernel API as you suggested, although we will probably revisit this in the future. Will this be acceptable? Thanks, Gal