> -----Original Message-----
> From: Tom Talpey
> Sent: Monday, August 14, 2017 12:55 PM
> To: Long Li <longli@xxxxxxxxxxxxx>; Steve French <sfrench@xxxxxxxxx>;
> linux-cifs@xxxxxxxxxxxxxxx; samba-technical@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> Subject: RE: [[PATCH v1] 05/37] [CIFS] SMBD: Implement API for upper layer
> to create SMBD transport and establish RDMA connection
>
> > -----Original Message-----
> > From: linux-cifs-owner@xxxxxxxxxxxxxxx
> > [mailto:linux-cifs-owner@xxxxxxxxxxxxxxx] On Behalf Of Long Li
> > Sent: Wednesday, August 2, 2017 4:10 PM
> > To: Steve French <sfrench@xxxxxxxxx>; linux-cifs@xxxxxxxxxxxxxxx;
> > samba-technical@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > Cc: Long Li <longli@xxxxxxxxxxxxx>
> > Subject: [[PATCH v1] 05/37] [CIFS] SMBD: Implement API for upper layer
> > to create SMBD transport and establish RDMA connection
> >
> > From: Long Li <longli@xxxxxxxxxxxxx>
> >
> > Implement the code for connecting to the SMBD server. The client and
> > server are connected using an RC Queue Pair over the RDMA API, which
> > supports InfiniBand, RoCE and iWARP. Upper layer code can call
> > cifs_create_rdma_session to establish a SMBD RDMA connection.
> >
> > +/* Upcall from RDMA CM */
> > +static int cifs_rdma_conn_upcall(
> > +        struct rdma_cm_id *id, struct rdma_cm_event *event) {
> > +        struct cifs_rdma_info *info = id->context;
> > +
> > +        log_rdma_event("event=%d status=%d\n", event->event,
> > +                event->status);
> > +
> > +        switch (event->event) {
> > +        case RDMA_CM_EVENT_ADDR_RESOLVED:
> > +        case RDMA_CM_EVENT_ROUTE_RESOLVED:
> > +                info->ri_rc = 0;
> > +                complete(&info->ri_done);
> > +                break;
> > +
> > +        case RDMA_CM_EVENT_ADDR_ERROR:
> > +                info->ri_rc = -EHOSTUNREACH;
> > +                complete(&info->ri_done);
> > +                break;
> > +
> > +        case RDMA_CM_EVENT_ROUTE_ERROR:
> > +                info->ri_rc = -ENETUNREACH;
> > +                complete(&info->ri_done);
> > +                break;
> > +
> > +        case RDMA_CM_EVENT_ESTABLISHED:
> > +        case RDMA_CM_EVENT_CONNECT_ERROR:
> > +        case RDMA_CM_EVENT_UNREACHABLE:
> > +        case RDMA_CM_EVENT_REJECTED:
> > +        case RDMA_CM_EVENT_DEVICE_REMOVAL:
> > +                log_rdma_event("connected event=%d\n", event->event);
> > +                info->connect_state = event->event;
> > +                break;
> > +
> > +        case RDMA_CM_EVENT_DISCONNECTED:
> > +                break;
> > +
> > +        default:
> > +                break;
> > +        }
> > +
> > +        return 0;
> > +}
>
> This code looks a lot like the connection stuff in the NFS/RDMA RPC
> transport. Does your code have the same needs? If so, you might consider
> moving this to a common RDMA handler.
>
> > +/* Upcall from RDMA QP */
> > +static void
> > +cifs_rdma_qp_async_error_upcall(struct ib_event *event, void *context) {
> > +        struct cifs_rdma_info *info = context;
> > +        log_rdma_event("%s on device %s info %p\n",
> > +                ib_event_msg(event->event), event->device->name, info);
> > +
> > +        switch (event->event)
> > +        {
> > +        case IB_EVENT_CQ_ERR:
> > +        case IB_EVENT_QP_FATAL:
> > +        case IB_EVENT_QP_REQ_ERR:
> > +        case IB_EVENT_QP_ACCESS_ERR:
> > +
> > +        default:
> > +                break;
> > +        }
> > +}
>
> Ditto. But, what's up with the empty switch(event->event) processing?

I have changed the code to disconnect the RDMA connection on QP errors.
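For illustration, a minimal sketch of that approach, assuming the usual
<rdma/rdma_cm.h>, <rdma/ib_verbs.h> and <linux/workqueue.h> includes; the
disconnect_work field and the cifs_rdma_disconnect_work() helper are
illustrative names, not code from the posted patch:

static void cifs_rdma_disconnect_work(struct work_struct *work)
{
        struct cifs_rdma_info *info =
                container_of(work, struct cifs_rdma_info, disconnect_work);

        /* Moves the QP to the error state so outstanding work requests
         * flush, and starts the CM disconnect sequence. */
        rdma_disconnect(info->id);
}

static void cifs_rdma_qp_async_error_upcall(struct ib_event *event, void *context)
{
        struct cifs_rdma_info *info = context;

        log_rdma_event("%s on device %s info %p\n",
                ib_event_msg(event->event), event->device->name, info);

        switch (event->event) {
        case IB_EVENT_CQ_ERR:
        case IB_EVENT_QP_FATAL:
        case IB_EVENT_QP_REQ_ERR:
        case IB_EVENT_QP_ACCESS_ERR:
                /* Fatal for this connection: defer the teardown to a
                 * work item rather than doing it in the event handler,
                 * which may run in atomic context. */
                schedule_work(&info->disconnect_work);
                break;
        default:
                break;
        }
}

The work item would be set up with INIT_WORK(&info->disconnect_work,
cifs_rdma_disconnect_work) when the transport is created.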
> > +static struct rdma_cm_id* cifs_rdma_create_id(
> > +        struct cifs_rdma_info *info, struct sockaddr *dstaddr)
> > +{
> ...
> > +        log_rdma_event("connecting to IP %pI4 port %d\n",
> > +                &addr_in->sin_addr, ntohs(addr_in->sin_port));
>
> ... and then ...
>
> > +        if (dstaddr->sa_family == AF_INET6)
> > +                sport = &((struct sockaddr_in6 *)dstaddr)->sin6_port;
> > +        else
> > +                sport = &((struct sockaddr_in *)dstaddr)->sin_port;
> > +
> > +        *sport = htons(445);
>
> ... and ...
>
> > +out:
> > +        // try port number 5445 if port 445 doesn't work
> > +        if (*sport == htons(445)) {
> > +                *sport = htons(5445);
> > +                goto try_again;
> > +        }
>
> Suggest rearranging the log_rdma_event() call to reflect reality.
>
> The IANA-assigned port for SMB Direct is 5445, and port 445 will be
> listening on TCP. Should you really be probing that port before 5445?
> I suggest not doing so unconditionally.

This part is reworked in V3 to behave as you suggested.

> > +struct cifs_rdma_info* cifs_create_rdma_session(
> > +        struct TCP_Server_Info *server, struct sockaddr *dstaddr) {
> ...
> > +        int max_pending = receive_credit_max + send_credit_target;
> ...
> > +        if (max_pending > info->id->device->attrs.max_cqe ||
> > +            max_pending > info->id->device->attrs.max_qp_wr) {
> > +                log_rdma_event("consider lowering receive_credit_max and "
> > +                        "send_credit_target. Possible CQE overrun, device "
> > +                        "reporting max_cpe %d max_qp_wr %d\n",
> > +                        info->id->device->attrs.max_cqe,
> > +                        info->id->device->attrs.max_qp_wr);
> > +                goto out2;
> > +        }
>
> I don't understand this. Why are you directing both Receive and Send
> completions to the same CQ? Won't that make it very hard to manage
> completions and their interrupts? Also, what device(s) have you seen
> trigger this log? CQs are generally allowed to be quite large.

I have moved them to separate completion queues in V3.

> > +        conn_param.responder_resources = 32;
> > +        if (info->id->device->attrs.max_qp_rd_atom < 32)
> > +                conn_param.responder_resources =
> > +                        info->id->device->attrs.max_qp_rd_atom;
> > +        conn_param.retry_count = 6;
> > +        conn_param.rnr_retry_count = 6;
> > +        conn_param.flow_control = 0;
>
> These choices warrant explanation. 32 responder resources is on the large
> side for most datacenter networks. And because of SMB Direct's credits,
> why is RNR_retry not simply zero?

Those have been fixed. On Mellanox ConnectX-3, attrs.max_qp_rd_atom is 16,
so I chose a default of 32 to work with hardware that can potentially
support more responder resources.
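For illustration, a minimal sketch of the connect parameters along those
lines; the cifs_rdma_do_connect() helper is an illustrative name, not code
from the posted patch:

static int cifs_rdma_do_connect(struct cifs_rdma_info *info)
{
        struct rdma_conn_param conn_param;

        memset(&conn_param, 0, sizeof(conn_param));
        /* Offer up to 32 responder resources, but never more than the
         * local device supports. */
        conn_param.responder_resources =
                min_t(int, 32, info->id->device->attrs.max_qp_rd_atom);
        /* The client does not initiate RDMA reads in SMB Direct. */
        conn_param.initiator_depth = 0;
        conn_param.retry_count = 6;
        /* SMB Direct receive credits guarantee a posted receive, so
         * RNR retries are not needed. */
        conn_param.rnr_retry_count = 0;
        conn_param.flow_control = 0;

        return rdma_connect(info->id, &conn_param);
}

With the min_t() clamp, the default of 32 only applies on hardware that can
actually support it (ConnectX-3 would get 16), and with crediting in place
an RNR retry count of 0 should be sufficient.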