Re: [PATCH rdma-next 00/13] Elastic Fabric Adapter (EFA) driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21-Dec-18 05:15, Jason Gunthorpe wrote:
> On Fri, Dec 21, 2018 at 01:28:14AM +0000, Hefty, Sean wrote:
>>>> I agree.  It has the general abstract hardware semantics that I
>>>> associated as 'verbs' - PD, MR, AH, QP, CQ.
>>>
>>> All most all modern high performance hardware has some variation on
>>> QP's and CQ's.
>>
>> I draw a distinction between command queues and QPs and whether the
>> addressing is closely coupled with it.
> 
> It doesn't sound like SRD has addressing associated with the QP, as it
> is unconnected. Addressing is surely associated with the individual
> command?
> 
> Whats the difference between a command queue and a verbs QP? Verbs QPs
> have the verbs defined properties and states. (notice EFA
> hardwired/ignored alot of these). It has the verbs state machines. QPs
> execute *verbs* actions (SEND/RECV/READ/WRITE/etc).

EFA queues are not command queues, these are verbs queues that support SEND/RECV
operations using AH and QP numbers for addressing and MR lkeys for local buffer
access.
We are missing the QP states at the moment, but support for these will be added
in the firmware.

> 
> Should 'rdma' be a subsystem that allows any arbitary process isolated
> command queues to be exposed to user space?? That is what USNIC is
> doing. But others have said no to this, they want to build a new
> subsystem more like vfio.>
>>> It doesn't have rkeys, right? So the PD&MR is just some variation of
>>> lkey gather/scatter from user memory??
>>
>> The PD also controls the AH.
> 
> Hmm? Not really in verbs. Is EFA doing something different?
> 
>> You could argue that it's unnecessary to group QPs, MRs, and AHs by
>> PDs, but that does match verb semantics.  The PD provides process
>> isolation.
> 
> verbs semantics require the PD grouping because it is a security boundary
> for the rkey. The verbs user must have control over the PD to ensure
> that multiple remote connections in the same process have isolated
> rkeys from each other.
> 
> If it wasn't for the rkey then all those objects would be simply
> grouped to the verbs FD with no PD.

We don't support RC/READ/WRITE so rkeys aren't existent in EFA, but we do use
PDs (and lkeys) to group the different objects according to the verbs semantics.

> 
>>> A MR without an rkey capability is not a MR, we've been calling that
>>> object a 'umem' in verbs lately.
>>
>> Well, the IB spec still calls this a MR.  :b
> 
> And AF_XDP calls it a umem and vfio calls it a 'dma map'.
> 
>>> This is my objection to these drivers, the device does not implement
>>> anything close to the expected verbs semantics for any of the objects
>>> it seems to use.
>>
>> It looks similar to an IB device that only supports UD QPs.
> 
> Gal already said it doesn't support 'verbs' UD. It supports two kinds
> of proprietary QPs, as far as I can tell. One is sort of like UD,
> maybe.

Sorry for not being clear about this, verbs UD will be supported.

> 
>> You honestly don't think you could write an app against SRD knowing
>> nothing more than you know now?
> 
> More than *I* know? Certainly not! I assume you've been reviewing
> their libfabric driver, so I think you know alot more than I do..
> 
> .. and that isn't the standard for defining a verb extension anyhow.

You're right, in addition to our userspace provider we will provide a functional
specification of SRD.

> 
>> Yes, the documentation could be improved to clarify the semantics,
>> but I do think a fair attempt could be made to program an app to use
>> SRD.
> 
> The semantics would need to be documented to a level that another
> vendor could make a compatible implementation - at a *minimum*.

I agree, will add.

> 
> .. and that ignores the issue that the wire protocol is proprietary,
> which we haven't ignored in the past.

I get your concerns, but how does the lack of documentation for the wire
protocol in EC2 network impact our users? What information that affects EFA's
usage is missing?
Every customer visible feature will be documented.

> 
> .. and that they have stated no intention to actually provide a verbs
> API for this functionality, only doing libfabric, it seems.

Sorry for not being clear about this as well, we do have intentions for a verbs
API :).
We have a libfabric provider which uses the verbs API at the moment, but we can
implement this through libibverbs as well.

> 
> So I see this the same as usnic. A non-verbs proprietary protocol that
> is used to build something more complicated in userspace.

I'm not familiar with what usnic are doing, but this is not what EFA is about.
This will be more clear in our userspace bits.

> 
> This is in contrast to QIB/HFI that used their non-verbs HW protocol
> to build verbs in SW, in the kernel.

I don't think our case is the same as qib/hfi, I do consider our device as
verbs, but we can add a software implementation for missing verbs functionality
(like RC) - it's a functionality our customers aren't interested in at the moment.

> 
>>>> I still view SRD as a generic QP type for reliable-unconnected send
>>>> and receive messages.
>>>
>>> Something proprietary and undocumented cannot be part of common
>>> verbs. I am absolutely adament on this. We cannot maintain the core
>>> code as a patchwork of undocumented driver-specific crap.
>>
>> I'm not sure why something proprietary is considered crap.  That
>> seems unnecessarily judgmental.  
> 
> Have you looked at the usnic kernel driver? :(
> 
>> The behavior of SRD can be described in more detail, even keeping
>> the protocol proprietary.
> 
> Sure, but this is a line we have not yet crossed in RDMA.

Does proprietary implementation mean proprietary protocol?

> 
>>> Mellanox already has a reliable delivery unconnected QP type that
>>> supports RDMA. It is exposed out of the driver as IBV_QPT_DRIVER
>>> because the protocol and behavior isn't documented.
>>
>> I would like to see this be defined generically as well, and really
>> anything that supports verbs in an 'intuitive' way.
> 
> Generally someone should use UCX to access this protocol, not
> verbs.. Just like you should use (apparently) libfabric to access SRD,
> not verbs.
> 
> I'm not sure who it benefits to document SRD just to justify being
> verbs to get the uapi when nothing in the kernel or userspace will use
> the 'SRD verbs'.

How is SRD different than DC? We'll expose a direct verb to introduce the new QP
type.
If the specification for SRD will suffice, we can implement it through standard
ibv_create_qp.

> 
>>> I don't see any difference with EFA's proprietary protocol.
>>
>> I still don't see the point of having every driver use value 255 to
>> indicate that the real value is some other location.  
> 
> It indicates it is not supported by the core stack, and will not be
> supported by any other driver. Seeing QPT_DRIVER outside the driver/hw
> directory is a huge red flag that patches will be NAK'd - ie ULPs
> can't use QPT_DRIVER. It is the appropriate placement for something
> expected to be propriety.
> 
> If there is an industry will to standardize then the definition can
> change.
> 
>> We're not constrained on enum values, and this just makes it more
>> likely that a request can get forwarded to the wrong driver.  
> 
> ioctl has protections against this already.
> 
> Look, this is a very strange situation. Since QIB/HFI PSM we have been
> allowing drivers to expose their own command channels to userspace
> that are proprietary. That isn't a problem, per say.
> 
> The objection is using the RDMA subsystem to access the uapi channel
> when the device isn't an RDMA device and doesn't provide verbs ops
> toward the kernel.
> 
> This has confused/complicated a lot of the kernel work because we see
> bad/confusing/non-standard implementations of verbs objects that don't
> act the way they should. In the past this kernel work has just said
> 'ignore usnic, it should even be here' - and honestly there is a bit
> of a "shrug, it is cisco's problem if usnic somehow breaks" because it
> is so incredibly far from all the other drivers.
> 
> I said it the last time this came up, if we want to go in this
> direction we should consider that these devices are somehow entirely
> different from the verbs oriented stack, and maybe they don't abuse
> the kernel facing verbs APIs in their quest to get access to user
> space.
> 
> But I haven't seen a proposal for something like this.. Even something
> fairly simple like declaring them a NON_VERBS device and hiding them
> from the in-kernel APIs might be enough. But them I'm not sure how we
> justify using the common verbs uAPI..

I can come up with a proposal for hiding EFA from in-kernel APIs, although I do
believe we will support some sort of kernel API in the future.
Regardless, I don't think that means we should declare EFA as a NON_VERBS device.

Also, we can implement a UD API for the kernel, it's not a smart thing to do
since there are no kernel users for that, but that would be providing verbs ops
towards the kernel and clear the non-standard implementation suspicion.

> 
> Mellanox and HFI both defined their own non-verbs uAPIs, maybe that is
> more appropriate here. It is hard to say when there is so little
> public information <shrug> 
> 
> Jason
> 



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux