On Thu, 2019-01-03 at 14:22 -0700, Jason Gunthorpe wrote:
> On Thu, Jan 03, 2019 at 08:14:54PM +0000, Hefty, Sean wrote:
> > > I assume that NVMe over fabric users/developers will disagree with
> > > you about such primary/secondary separation.
> >
> > The points both Sagi and Doug are making are valid. The primary
> > users of hardware in the linux-rdma subsystem are user space
> > applications.
>
> That is only true in certain verticals, particularly HPC. More
> generally there are lots of verticals that care about RDMA only to run
> kernel-based things like NVMeoF or SMB-Direct (Windows) and never run
> a user application. Arguably there actually are more customers by
> number in the latter (though possibly more CPU cores/adaptors in the
> former)

I doubt that. After all, everything Jump Trading does is user space,
just like HPC. So is everything based on WebSphere low latency
messaging. I don't know an easy way to get an accurate analysis of
this point, but I don't really think it matters all that much. Clearly,
user space applications that don't need kernel support do exist and
are a valid segment of our user base. Whether or not they are the
majority, they are still valid.

> The original point is still un-addressed. We, as a community, have
> identified usnic as a bad idea that does not belong in
> drivers/infiniband. This has been a pretty much universal position of
> everyone who has worked on the core RDMA stack.

I'll generally agree with this. It turns out usnic is not so much
about access to a shared device with PD, QP, CQ, MR and AH support,
but an SRIOV network device attached to a process with minimal (but
sufficient, I think) security measures to at least make sure that
process can't listen in on other traffic. From there, it's just UDP on
a private network adapter.

> Do we want to reverse course on that and accept that usnic-like things
> are actually OK?
> (ie no kverbs, no libibverbs and totally proprietary
> everything)

Even if we say the above is true, there are other factors to consider.
Unlike usnic, efa follows the existing RDMA device model much more
closely. It's not a separate device per process, it's a single shared
device. It actually *does* have the concept of PD, CQ, QP, and AH. It
doesn't do UDP, it actually does post_send and post_recv. So even if
it appears usnic-like in having no kverbs and no libibverbs (at the
moment), it differs greatly from usnic in these other traits.

> I still haven't actually yet heard people saying yes to the above..

I think usnic falls too far outside of the RDMA model for my tastes.
I'm not inclined to say we should reverse course on something that far
apart being a good thing. But the fact that efa follows the semantics
of the RDMA subsystem when it comes to all the other traits I listed
above makes me inclined to be more accepting of efa. If you add in a
libibverbs provider and actual official, working UD support, then even
more so.

> The discussion about kernel support started because I said supporting
> the existing RC ULPs *clearly* means the device is not a 'usnic'
> class driver.

Agreed. That would be a clear indicator. But just because this is a
clear differentiator does not mean lack of this is sufficient to
indicate the opposite. See above for the various traits that make me
think the efa adapter is not the same as usnic.

> Nobody has presented another criterion that makes EFA and usnic any
> different, so we are back to the start. Was usnic a bad idea or not?
> How do we support 'usnic' style devices without wrecking the rest of
> the stack?
>
> verbs is supposed to be a multi-vendor standard, not an enumeration of
> every kind of proprietary device-specific behavior.

What it's supposed to be...and what it is...well, that's evolving on a
day to day basis, and it's already got plenty of proprietary,
device-specific behavior.
To be honest, I don't even consider libibverbs a viable programming
interface anymore for the most part. What I mean by this is that
libibverbs exists, and there are already existing consumers of
libibverbs. I expect those things to continue. But, for the most part,
people are encouraged nowadays to use another interface at a higher
level that then uses libibverbs behind the scenes: libfabric, UCX,
MPIs, existing low latency middlewares, etc. Nobody I know of is
telling people to write new code to libibverbs.

When thought of that way, I'm very happy that when we pulled rdma-core
together, we used the notion that "anything that interacts with kernel
device files and talks to the core RDMA kernel ABI" was appropriate
for rdma-core. That way libibverbs can be the common layer all of the
other things use to talk to the kernel while providing a more friendly
API to the user than just raw verbs.

But by that same token, if we are the common, device file opening,
kernel interacting layer, then we must also be the place where all of
the proprietary vendor specific stuff is implemented. And we already
are. I cite all of the recent Mellanox Direct Verbs stuff as exactly
that.

It does raise the question of whether, to be consistent, we should
have requested psm/psm2 also be part of rdma-core. Mellanox uses the
driver direct model, Intel uses a separate cdev, and efa is apparently
intending to follow Mellanox's example. I don't see a problem with
that.

> I'm kind of half wondering if the idea to hide 'usnic' devices from
> the kernel ULPs doesn't go far enough - maybe we should have a
> /dev/urdma for them as well? That would also go a long way to address
> some of my past annoyance that hfi/qib have a private cdev..
>
> > That's why usNIC doesn't care about kernel users. That's why EFA
> > has no kernel support.
>
> I'd say they don't care because they aren't actually classic RDMA
> devices, they are something new.
>
> > Requiring support for a specific user space library is basically
> > pointless and unenforceable. Once the kABI is exposed, any software
> > can write to it.
>
> It is probably a bad idea to implement the uverbs abi without copying
> the new machinery for doing so from verbs, it is complicated, and the
> ioctl support makes it even more tricky.

I second the idea that trying to do the uverbs kernel API outside
libibverbs is a bad idea. My preference would be to have a new
libibverbs provider that supports what efa needs and then have
libfabric use that, simply because of the issue you mention here.

> Let alone the issue that libibverbs assumes it has providers for all
> the system devices and pukes if it doesn't - that will need fixing too
> if we actually want to properly support this model.

We can fix libibverbs if we ever do take a driver without a verbs
driver, but I don't know that we are there yet, and I still think
libibverbs as the shim to interface with the kernel is a good idea.

--
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD