Re: [PATCH 14/14] IB/mad: Add final OPA MAD processing

On 06/14/2015 03:16 PM, Liran Liss wrote:
>> From: Doug Ledford [mailto:dledford@xxxxxxxxxx]
> 
>>> But the node_type stands for more than just an abstract RDMA device:
>>> In IB, it designates an instance of an industry-standard, well-defined,
>>> device type: its possible link types, transport, semantics, management,
>>> everything.
>>> It *should* be exposed to user-space so apps that know and care what
>>> they are running on could continue to work.
>>
>> I'm sorry, but your argument here is not very convincing at all.  And
>> it's somewhat hypocritical.  When RoCE was first introduced, the *exact*
>> same argument could be used to argue for why RoCE should require a new
>> node_type.  Except then, because RoCE was your own, you argued for, and
>> got, an expansion of the IB node_type definition that now included a
>> relevant link_layer attribute that apps never needed to care about
>> before.  However, now you are a victim of your own success.  You set the
>> standard then that if the new device can properly emulate an IB Verbs/IB
>> Link Layer device in terms of A) supported primitives (iWARP and usNIC
>> both fail here, and hence why they have their own node_types) and B)
>> queue pair creation process modulo link layer specific addressing
>> attributes, then that device qualifies to use the IB_CA node_type and
>> merely needs a link_layer attribute to differentiate it.
>>
> 
> No. RoCE is an open standard from the IBTA with the exact same RDMA protocol semantics as InfiniBand and a clear set of compliance rules without which an implementation can't claim to be such. A RoCE device *is* an IB CA with an Ethernet link.
> In contrast, OPA is a proprietary protocol. We don't know what primitives are supported, and whether the semantics of supported primitives are the same as in InfiniBand.

Intel has stated on this list that they intend for RDMA apps to run on
OPA transparently.  That pretty much implies the list of primitives and
everything else that they must support.  However, time will tell if they
succeeded or not.

>> The new OPA stuff appears to be following *exactly* the same development
>> model/path that RoCE did.  When RoCE was introduced, all the apps that
>> really cared about low level addressing on the link layer had to be
>> modified to encompass the new link type.  This is simply link_layer
>> number three for apps to care about.
>>
> 
> You are missing my point. API transparency is not a synonym for full semantic equivalence. The Node Type doesn’t indicate the level of adherence to an API. Node Type indicates compliance with a specification (e.g. wire protocol, remote order of execution, error semantics, architectural limitations, etc). The IBTA CA and Switch Node Types belong to devices that are compliant with the corresponding specifications from the InfiniBand Trade Association. And that doesn’t prevent applications from being coded to run over nodes of different Node Types, as happens today with IB/RoCE and iWARP.
> 
> This has nothing to do with addressing.

And whether you like it or not, Intel is intentionally creating a
device/fabric with the specific intention of mimicking the IB_CA device
type (with stated exceptions for MAD packets and addresses).  They
obviously won't have certification as an IB_CA, but that's not their
aim.  Their aim is to be a functional drop-in replacement that apps
don't need to know about except for the stated exceptions.

And I'm not missing your point.  Your point is inappropriate.  You're
trying to conflate certification with a functional API.  The IB_CA node
type is not an official certification of anything, and the Linux kernel
is not an official certifying body for anything.  If you want
certification, you go to the OFA and the UNH-IOL testing program.
There, you have the rights to the certification branding logo and you
have the right to deny access to that logo to anyone that doesn't meet
the branding requirements.

You're right that apps can be coded to other CA types, like RNICs and
USNICs.  However, those are all very different from an IB_CA due to
limited queue pair types or limited primitives.  If OPA had that same
limitation then I would agree it needs a different node type.

So this will be my litmus test.  Currently, an app that supports all of
the RDMA types looks like this:

if (node_type == RNIC)
	do iwarpy stuff
else if (node_type == USNIC)
	do USNIC stuff
else if (node_type == IB_CA)
	do IB verbs stuff
	if (link_layer == Ethernet)
		do RoCE addressing/management
	else
		do IB addressing/management



If, in the end, apps that are modified to support OPA end up looking
like this:

if (node_type == RNIC)
	do iwarpy stuff
else if (node_type == USNIC)
	do USNIC stuff
else if (node_type == IB_CA || node_type == OPA_CA)
	do IB verbs stuff
	if (node_type == OPA_CA)
		do OPA addressing/management
	else if (link_layer == Ethernet)
		do RoCE addressing/management
	else
		do IB addressing/management

where you can plainly see that the exact same goal can be accomplished
whether you have an OPA node_type or an IB_CA node_type + OPA
link_layer, then I will be fine with either a new node_type or a new
link_layer.  They will be functionally equivalent as far as I'm concerned.
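
To make the equivalence concrete, here is a minimal C sketch of the two
dispatch styles side by side. Everything in it is hypothetical and exists
only for illustration: the enums, their values, and the function names are
not the kernel's or libibverbs' definitions, and a real application would do
actual work where this one only returns a tag.

```c
/* Hypothetical enums for illustration only; these are NOT the kernel's
 * or libibverbs' definitions. */
enum node_type  { NODE_RNIC, NODE_USNIC, NODE_IB_CA, NODE_OPA_CA };
enum link_layer { LINK_INFINIBAND, LINK_ETHERNET, LINK_OPA };

/* Which addressing/management path the app ends up taking. */
enum addressing { ADDR_NONE, ADDR_IWARP, ADDR_USNIC,
                  ADDR_IB, ADDR_ROCE, ADDR_OPA };

/* Variant A: OPA is reported as its own node_type. */
static enum addressing pick_with_opa_node_type(enum node_type nt,
                                               enum link_layer ll)
{
	if (nt == NODE_RNIC)
		return ADDR_IWARP;
	if (nt == NODE_USNIC)
		return ADDR_USNIC;
	if (nt == NODE_IB_CA || nt == NODE_OPA_CA) {
		/* the common IB verbs path would run here */
		if (nt == NODE_OPA_CA)
			return ADDR_OPA;
		if (ll == LINK_ETHERNET)
			return ADDR_ROCE;
		return ADDR_IB;
	}
	return ADDR_NONE;
}

/* Variant B: OPA is reported as IB_CA plus a third link_layer. */
static enum addressing pick_with_opa_link_layer(enum node_type nt,
                                                enum link_layer ll)
{
	if (nt == NODE_RNIC)
		return ADDR_IWARP;
	if (nt == NODE_USNIC)
		return ADDR_USNIC;
	if (nt == NODE_IB_CA) {
		/* the common IB verbs path would run here */
		if (ll == LINK_OPA)
			return ADDR_OPA;
		if (ll == LINK_ETHERNET)
			return ADDR_ROCE;
		return ADDR_IB;
	}
	return ADDR_NONE;
}
```

In both variants an OPA device shares the IB verbs path and differs only in
the addressing/management branch, which is the condition under which either
encoding would be acceptable.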

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: 0E572FDD

