On Thu, 2015-05-14 at 12:49 -0700, sean.hefty@xxxxxxxxx wrote:
> From: Ted Kim <ted.h.kim@xxxxxxxxxx>
>
> Problem reported by: Ted Kim <ted.h.kim@xxxxxxxxxx>:
>
> We have a case where a Linux system and a non-Linux system are
> trying to interoperate. The Linux host is the active side and
> starts the connection establishment, but later decides to not go
> through with the connection setup and does rdma_destroy_id().
>
> The rdma_destroy_id() eventually works its way down to cm_destroy_id()
> in core/cm.c, where a REJ is sent. The non-Linux system
> has some trouble recognizing the REJ because of:
>
> A. CM states which can't receive the REJ
> B. Some issues about REJ formatting (missing comm ID)
>
> ISSUE A: That part of the spec says, a Consumer Reject REJ can be
> sent for a connection abort, but it goes further
> and says: can send a REJ message with a "Consumer Reject"
> Reason code if they are in a CM state (i.e. REP
> Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows
> a REJ to be sent (lines 35-38).
>
> Of the states listed there in that sentence, it would
> seem to limit the active side to using the Consumer Reject
> (for the abort case) in just the REP-Rcvd and MRA-REP-Sent
> states. That is basically only after the active side
> sees a REP (or alternatively goes down the state transitions
> to timeout in which case a Timeout REJ is sent).
>
> As a fix, in cm_destroy_id() move the IB_CM_MRA_REQ_RCVD case
> to the same as REQ_SENT. Essentially, make a REJ sent after
> getting an MRA on active side a timeout rather than Consumer-
> Reject, which is arguably more correct with the CM state
> diagrams previous to getting a REP.
>
> Signed-off-by: Ted Kim <ted.h.kim@xxxxxxxxxx>
> Signed-off-by: Sean Hefty <sean.hefty@xxxxxxxxx>

I've picked this up for 4.2.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
GPG KeyID: 0E572FDD
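
For readers following the thread, the state handling described in the quoted patch can be sketched roughly as below. This is a simplified, self-contained model and not the code in core/cm.c: the enum names and the helper rej_reason_on_destroy() are hypothetical stand-ins chosen for illustration, and the real cm_destroy_id() does considerably more (cancelling MADs, locking, cleanup). It only shows which REJ reason the active side would pick per CM state once the MRA-received case is grouped with REQ-sent.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the CM states named in the
 * thread; these are NOT the kernel's definitions. */
enum cm_state {
    CM_REQ_SENT,      /* active side: REQ sent, nothing received yet */
    CM_MRA_REQ_RCVD,  /* active side: MRA received in reply to our REQ */
    CM_REQ_RCVD,      /* passive side: REQ received */
    CM_MRA_REQ_SENT,  /* passive side: MRA sent in reply to a REQ */
    CM_REP_RCVD,      /* active side: REP received */
    CM_MRA_REP_SENT,  /* active side: MRA sent in reply to a REP */
    CM_OTHER          /* any state not modelled in this sketch */
};

/* Hypothetical reject reasons used by this sketch. */
enum rej_reason { REJ_NONE, REJ_TIMEOUT, REJ_CONSUMER };

/* Pick the REJ reason to use when an ID is destroyed in the given state,
 * following the grouping proposed in the patch description. */
static enum rej_reason rej_reason_on_destroy(enum cm_state state)
{
    switch (state) {
    case CM_REQ_SENT:
    case CM_MRA_REQ_RCVD:
        /* No REP seen yet on the active side: report a timeout. */
        return REJ_TIMEOUT;
    case CM_REQ_RCVD:
    case CM_MRA_REQ_SENT:
    case CM_REP_RCVD:
    case CM_MRA_REP_SENT:
        /* States the quoted spec text lists as allowing Consumer Reject. */
        return REJ_CONSUMER;
    default:
        /* This sketch sends no REJ for states it does not model. */
        return REJ_NONE;
    }
}

/* Small self-check of the mapping above. */
static void check_destroy_mapping(void)
{
    assert(rej_reason_on_destroy(CM_REQ_SENT) == REJ_TIMEOUT);
    assert(rej_reason_on_destroy(CM_MRA_REQ_RCVD) == REJ_TIMEOUT);
    assert(rej_reason_on_destroy(CM_REP_RCVD) == REJ_CONSUMER);
}
```

The point of the grouping is that, before the fix as described, destroying an ID after an MRA had arrived would have taken the Consumer Reject path; in this model CM_MRA_REQ_RCVD shares the timeout path with CM_REQ_SENT instead.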