Re: [PATCH rdma-next 0/3] Support out of order data placement

On 6/12/2017 7:59 PM, Parav Pandit wrote:
-----Original Message-----
From: Tom Talpey [mailto:tom@xxxxxxxxxx]
Sent: Monday, June 12, 2017 6:44 PM
To: Parav Pandit <parav@xxxxxxxxxxxx>; Jason Gunthorpe
<jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
Cc: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>; leon@xxxxxxxxxx;
dledford@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Idan Burstein
<idanb@xxxxxxxxxxxx>
Subject: Re: [PATCH rdma-next 0/3] Support out of order data placement

On 6/12/2017 6:54 PM, Parav Pandit wrote:
Hi Tom,

-----Original Message-----
From: Tom Talpey [mailto:tom@xxxxxxxxxx]
Sent: Monday, June 12, 2017 5:20 PM
To: Parav Pandit <parav@xxxxxxxxxxxx>; Jason Gunthorpe
<jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
Cc: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>; leon@xxxxxxxxxx;
dledford@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Idan Burstein
<idanb@xxxxxxxxxxxx>
Subject: Re: [PATCH rdma-next 0/3] Support out of order data placement

On 6/12/2017 5:32 PM, Parav Pandit wrote:
Hi Tom,
...

I agree with Jason, the bit should be 1 by default, if defined as you
propose.
Out-of-order is the norm, not the exception, for ULPs.
Honestly, I think you should perhaps consider making it the default
on your devices, and allowing only MLX-aware ULPs to turn it off.


There can be cases in deployment where the responder supports
receiving out-of-order, but the requester doesn't.

Yuck! So this needs to be negotiated end-to-end, and by the upper layer?
Talk about barriers to adoption, and opportunities for disaster.

Since Jason confirmed that all Linux kernel consumers are coded to
comply with the o9-20 requirement, I think kernel-based rdma-cm
consumers can be enabled transparently end-to-end, without the ULP's
involvement, through rdma_accept() and rdma_connect().

I have two thoughts here.

1) You seem to assume all consumers are Linux, and do not need to
negotiate. This is a dangerous assumption.
Certainly not, I didn't assume that. I just gave one example where known consumers can be enabled without modifying the ULP.
This is explained further under the 3rd question.
Other consumers can work with this solution too.
For example, take a Linux rdmacm-based client and a server based on another OS:
The client is ooo capable.
The server is not ooo capable.
Once you follow the rdmacm-based sequence below, it will be clear how this works.

Oh, so there's a MAD protocol change under the hood. Well,
that's a wider question. And I still don't understand how
existing, non-strict-requiring protocols can take advantage
of this. Nor how this works for non-Mellanox, non-IB/RoCE
implementations.

Again, I'd be a lot less concerned if non-strict were the
default, and strict mode was negotiated. It's all just so
upside-down.

Tom.

2) I assume that there is some performance benefit to toggling this setting
to non-strict. So, how do existing consumers get this advantage, especially
since they don't need strict semantics? Bearing in mind that they do have
to negotiate this end-to-end, meaning they require a protocol extension.
I don't have a completely transparent upstream solution for existing consumers yet.

Actually. I have a third thought. Since this is an attribute to qp creation,
performed even before establishing a connection, how does the upper layer
know when to set it?
This is not at QP creation time. I have described it in Documentation/out_of_order.txt, in usage section 3.
It happens at the QP state transition from INIT to RTR.
Here is the flow. It's just not coded enough yet for posting patches.

1. When the rdmacm active side creates the QP, it is in the INIT state.
2. The active side sends the MAD_Req msg (indicating ooo_requested=1).
3. When the rdmacm passive side receives the message, it looks up the device_cap attribute and matches it against the ooo_requested flag.
4. When the device supports it, the MAD_Rsp msg sets ooo_enabled=1; if it doesn't support it, ooo_enabled=0.
5. The rdmacm passive side creates the QP and moves it to the RTR state (with the QP ooo enabled bit set).
6. The active side receives the message and moves the QP to the RTR and then RTS state, based on the bit setting received from the passive side.

The flow is no different from how the rest of the connection-specific parameters are shared, such as IRD/ORD, PSN, timeouts, MTU, etc.
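
To make the six steps easier to follow, here is a rough Python model of the handshake. It is only a sketch, not the patch set or the real rdmacm code: Endpoint, QpState, the dict-shaped messages, and all function names are hypothetical. It also assumes the active side sets ooo_requested only when its own device is ooo capable, which the steps above do not spell out.

```python
from dataclasses import dataclass
from enum import Enum


class QpState(Enum):
    INIT = "INIT"
    RTR = "RTR"
    RTS = "RTS"


@dataclass
class Endpoint:
    """One side of the connection (hypothetical model, not a kernel struct)."""
    name: str
    device_supports_ooo: bool          # stands in for the device_cap attribute
    qp_state: QpState = QpState.INIT   # step 1: a freshly created QP is in INIT
    qp_ooo_enabled: bool = False


def build_request(active: Endpoint) -> dict:
    # Step 2: the request carries ooo_requested.
    # Assumption: it is set only if the active side's own device can do ooo.
    return {"ooo_requested": active.device_supports_ooo}


def handle_request(passive: Endpoint, req: dict) -> dict:
    # Steps 3-4: match the requested flag against the local device capability.
    enabled = bool(req["ooo_requested"]) and passive.device_supports_ooo
    # Step 5: passive side creates its QP and moves it to RTR with the agreed bit.
    passive.qp_ooo_enabled = enabled
    passive.qp_state = QpState.RTR
    return {"ooo_enabled": enabled}


def handle_response(active: Endpoint, rsp: dict) -> None:
    # Step 6: active side applies the bit chosen by the passive side,
    # then walks its QP through RTR to RTS.
    active.qp_ooo_enabled = bool(rsp["ooo_enabled"])
    active.qp_state = QpState.RTR
    active.qp_state = QpState.RTS


def negotiate(active: Endpoint, passive: Endpoint) -> bool:
    """Run the whole exchange and return the agreed ooo setting."""
    rsp = handle_request(passive, build_request(active))
    handle_response(active, rsp)
    return active.qp_ooo_enabled


if __name__ == "__main__":
    client = Endpoint("linux-client", device_supports_ooo=True)
    server = Endpoint("other-os-server", device_supports_ooo=False)
    print("ooo enabled:", negotiate(client, server))
```

With the client/server example from earlier in this mail (client ooo capable, server not), the model ends up with ooo disabled on both QPs, so the connection falls back to strict ordering without the ULP doing anything.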




Tom.