RE: [PATCH rdma-next 0/3] Support out of order data placement

Hi Tom,


> -----Original Message-----
> From: Tom Talpey [mailto:tom@xxxxxxxxxx]
> Sent: Monday, June 12, 2017 6:44 PM
> To: Parav Pandit <parav@xxxxxxxxxxxx>; Jason Gunthorpe
> <jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
> Cc: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>; leon@xxxxxxxxxx;
> dledford@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Idan Burstein
> <idanb@xxxxxxxxxxxx>
> Subject: Re: [PATCH rdma-next 0/3] Support out of order data placement
> 
> On 6/12/2017 6:54 PM, Parav Pandit wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Talpey [mailto:tom@xxxxxxxxxx]
> >> Sent: Monday, June 12, 2017 5:20 PM
> >> To: Parav Pandit <parav@xxxxxxxxxxxx>; Jason Gunthorpe
> >> <jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
> >> Cc: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>; leon@xxxxxxxxxx;
> >> dledford@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Idan Burstein
> >> <idanb@xxxxxxxxxxxx>
> >> Subject: Re: [PATCH rdma-next 0/3] Support out of order data
> >> placement
> >>
> >> On 6/12/2017 5:32 PM, Parav Pandit wrote:
> >>> Hi Tom,
> >> ...
> >>>>
> >>>> I agree with Jason, the bit should be 1 by default, if defined as
> >>>> you
> >> propose.
> >>>> Out-of-order is the norm, not the exception, for ULPs.
> >>>> Honestly, I think you should perhaps consider making it the default
> >>>> on your devices, and allowing only MLX-aware ULPs to turn it off.
> >>>>
> >>>
> >>> There can be cases in deployment where responder has support for
> >> receiving out-of-order, but requester doesn't.
> >>
> >> Yuck! So this needs to be negotiated end-to-end, and by the upper layer?
> >> Talk about barriers to adoption, and opportunities for disaster.
> >>
> > As Jason confirmed that all Linux kernel consumers are coded to be
> > compliant to o9-20 requirement, So I think kernel based rdma-cm
> consumers can be transparently enabled end-to-end without ULP's
> involvement with rdma_accept() and rdma_connect().
> 
> I have two thoughts here.
> 
> 1) You seem to assume all consumers are Linux, and do not need to
> negotiate. This is a dangerous assumption.
Certainly not; I didn't assume that. I only gave one example showing that known consumers can be enabled without modifying the ULP.
This is explained further under your third question.
Other consumers can work with this solution as well.
For example, take a Linux rdmacm-based client and a server based on another OS.
The client is ooo capable.
The server is not ooo capable.
Once you follow the rdmacm-based sequence below, it will be clear how this works.

> 
> 2) I assume that there is some performance benefit to toggling this setting
> to non-strict. So, how do existing consumers get this advantage, especially
> since they don't need strict semantics? Bearing in mind that they do have
> to negotiate this end-to-end, meaning they require a protocol extension.
I don't have a completely transparent upstream solution for existing consumers yet.
> 
> Actually. I have a third thought. Since this is an attribute to qp creation,
> performed even before establishing a connection, how does the upper layer
> know when to set it?
This is not set at QP creation time. I have described this in usage section 3 of Documentation/out_of_order.txt.
It is set at the QP state transition from INIT to RTR.
Here is the flow. It is just not coded far enough yet to post as patches.

1. When the rdmacm active side creates the QP, it is in the INIT state.
2. The active side sends the MAD_Req msg (indicating ooo_requested=1).
3. When the rdmacm passive side receives the message, it looks up the device_cap attribute and matches it against the ooo_requested flag.
4. If the device supports it, the MAD_Rsp msg sets ooo_enabled=1; if it doesn't, ooo_enabled=0.
5. The rdmacm passive side creates the QP and moves it to the RTR state (with the QP ooo enabled bit set accordingly).
6. The active side receives the message and moves the QP to RTR and then RTS, based on the bit setting received from the passive side.

The flow is no different from how the rest of the connection-specific parameters are shared, such as IRD/ORD, PSN, timeouts, MTU, etc.
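
The six steps above can be modelled roughly as below. This is only an illustrative sketch, not the patch: every name in it (sketch_qp, cm_req, cm_rsp, active_build_req, passive_handle_req, active_handle_rsp) is hypothetical and does not exist in rdma-cm or verbs. It also assumes the active side requests ooo only when its own device is capable, which the flow does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* All names are hypothetical; none are real rdma-cm or verbs symbols. */
enum qp_state { QP_INIT, QP_RTR, QP_RTS };

struct sketch_qp {
    enum qp_state state;
    bool ooo_enabled;          /* applied at the INIT -> RTR transition */
};

struct cm_req { bool ooo_requested; };  /* carried in the MAD_Req msg */
struct cm_rsp { bool ooo_enabled; };    /* carried in the MAD_Rsp msg */

/* Steps 1-2: the active QP is in INIT; build the request.
 * Assumption: ooo is requested only if the local device is capable. */
static struct cm_req active_build_req(const struct sketch_qp *qp,
                                      bool local_dev_ooo_cap)
{
    assert(qp->state == QP_INIT);
    struct cm_req req = { .ooo_requested = local_dev_ooo_cap };
    return req;
}

/* Steps 3-5: match the request against the local device capability,
 * create the passive QP, and move it to RTR with the negotiated bit. */
static struct cm_rsp passive_handle_req(const struct cm_req *req,
                                        bool local_dev_ooo_cap,
                                        struct sketch_qp *qp)
{
    struct cm_rsp rsp = {
        .ooo_enabled = req->ooo_requested && local_dev_ooo_cap,
    };

    qp->state = QP_INIT;               /* QP created by the passive side */
    qp->ooo_enabled = rsp.ooo_enabled;
    qp->state = QP_RTR;
    return rsp;
}

/* Step 6: the active side applies the bit received from the passive
 * side, then moves INIT -> RTR -> RTS. */
static void active_handle_rsp(struct sketch_qp *qp, const struct cm_rsp *rsp)
{
    assert(qp->state == QP_INIT);
    qp->ooo_enabled = rsp->ooo_enabled;
    qp->state = QP_RTR;
    qp->state = QP_RTS;
}
```

In the example mentioned earlier (client ooo capable, server not), the response carries ooo_enabled=0 and both QPs come up with strict placement. Out-of-order placement is enabled only when both sides support it, and the ULP is not involved in either case.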



> 
> Tom.



