Re: [PATCH rdma-next 00/13] Elastic Fabric Adapter (EFA) driver

On Thu, 2019-01-03 at 14:22 -0700, Jason Gunthorpe wrote:
> On Thu, Jan 03, 2019 at 08:14:54PM +0000, Hefty, Sean wrote:
> > > I assume that NVMe over fabric users/developers will disagree with
> > > you about such primary/secondary separation.
> > 
> > The points both Sagi and Doug are making are valid.  The primary
> > users of hardware in the linux-rdma subsystem are user space
> > applications.
> 
> That is only true in certain verticals, particularly HPC. More
> generally there are lots of verticals that care about RDMA only to run
> kernel-based things like NVMeoF or SMB-Direct (windows) and never run
> a user application. Arguably there actually are more customers by
> number in the latter (though possibly more CPU cores/adaptors in the
> former)

I doubt that.  After all, everything Jump Trading does is user space,
just like HPC.  So is everything based on WebSphere Low Latency
Messaging.  I don't know an easy way to get an accurate analysis of this
point, but I don't really think it matters all that much.  Clearly, user
space applications that don't need kernel support do exist and are a
valid segment of our user base.  Whether or not they are the majority,
they are still valid.

> The original point is still un-addressed. We, as a community, have
> identified usnic as a bad idea that does not belong in
> drivers/infiniband. This has been a pretty much universal position of
> everyone who has worked on the core RDMA stack.

I'll generally agree with this.  It turns out usnic is not so much about
access to a shared device with PD, QP, CQ, MR and AH support, but an
SR-IOV network device attached to a process with minimal (but sufficient
I think) security measures to at least make sure that process can't
listen in on other traffic.  From there, it's just UDP on a private
network adapter.

> Do we want to reverse course on that and accept that usnic-like things
> are actually OK? (ie no kverbs, no libibverbs and totally proprietary
> everything)

Even if we say the above is true, there are other factors to consider.
Unlike usnic, efa follows the existing RDMA device model much more
closely.  It's not a separate device per process, it's a single shared
device.  It actually *does* have the concept of PD, CQ, QP, and AH.  It
doesn't do UDP, it actually does post_send and post_recv.  So even if it
appears usnic-like in having no kverbs and no libibverbs (at the
moment), it differs greatly from usnic in these other traits.

> I still haven't actually yet heard people saying yes to the above..

I think usnic falls too far outside of the RDMA model for my tastes.
I'm not inclined to say we should reverse course and call something that
far from the model a good thing.  But the fact that efa follows the
semantics of the RDMA subsystem in all the other traits I listed above
makes me inclined to be more accepting of efa.  If you add in a
libibverbs provider and actual official, working UD support, then even
more so.

> The discussion about kernel support started because I said supporting
> the existing RC ULPs *clearly* means the device is not a 'usnic'
> class driver.

Agreed.  That would be a clear indicator.  But the fact that it is a
clear differentiator does not mean that its absence is sufficient to
indicate the opposite.  See above for the various traits that make me
think the efa adapter is not the same as usnic.

> Nobody has presented another criterion that makes EFA and usnic any
> different, so we are back to the start. Was usnic a bad idea or not?
> How do we support 'usnic' style devices without wrecking the rest of
> the stack?
> 
> verbs is supposed to be a multi-vendor standard, not an enumeration of
> every kind of proprietary device-specific behavior.

What it's supposed to be...and what it is...well, that's evolving on a
day-to-day basis, and it's already got plenty of proprietary,
device-specific behavior.

To be honest, I don't even consider libibverbs a viable programming
interface anymore for the most part.  What I mean by this is that
libibverbs exists, and there are already existing consumers of
libibverbs.  I expect those things to continue.  But, for the most part,
people are encouraged nowadays to use another interface at a higher
level that then uses libibverbs behind the scenes: libfabric, UCX, MPIs,
existing low latency middlewares, etc.  Nobody I know of is telling
people to write new code to libibverbs.

When thought of that way, I'm very happy that when we pulled rdma-core
together, we used the notion that "anything that interacts with kernel
device files and talks to the core RDMA kernel ABI" was appropriate for
rdma-core.  That way libibverbs can be the common layer all of the other
things use to talk to the kernel, while they provide a more friendly API
to the user than just raw verbs.

But by that same token, if we are the common, device-file-opening,
kernel-interacting layer, then we must also be the place where all of
the proprietary vendor-specific stuff is implemented.  And we already
are.  I cite all of the recent Mellanox Direct Verbs stuff as exactly
that.  It does raise the question of whether, to be consistent, we
should have requested psm/psm2 also be part of rdma-core.  Mellanox uses
the driver direct model, Intel uses a separate cdev, and efa is
apparently intending to follow Mellanox's example.  I don't see a
problem with that. 

> I'm kind of half wondering if the idea to hide 'usnic' devices from
> the kernel ULPs doesn't go far enough - maybe we should have a
> /dev/urdma for them as well? That would also go a long way to address
> some of my past annoyance that hfi/qib have a private cdev..
> 
> > That's why usNIC doesn't care about kernel users.  That's why EFA
> > has no kernel support.  
> 
> I'd say they don't care because they aren't actually classic RDMA
> devices, they are something new.
> 
> > Requiring support for a specific user space library is basically
> > pointless and unenforceable.  Once the kABI is exposed, any software
> > can write to it.
> 
> It is probably a bad idea to implement the uverbs abi without copying
> the new machinery for doing so from verbs, it is complicated, and the
> ioctl support makes it even more tricky.

I second the idea that trying to do the uverbs kernel API outside
libibverbs is a bad idea.  My preference would be to have a new
libibverbs provider that supports what efa needs and then have libfabric
use that, simply because of the issue you mention here.

> Let alone the issue that libibverbs assumes it has providers for all
> the system devices and pukes if it doesn't - that will need fixing too
> if we actually want to properly support this model.

We can fix libibverbs if we ever do take a kernel driver without a
matching verbs provider, but I don't know that we are there yet, and I
still think libibverbs as the shim to interface with the kernel is a
good idea.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
