Zhu,

Since checking for mr == NULL in rxe_mr_copy() fixes the problem you were
seeing in rping, perhaps it would be a good idea to apply the following patch,
which would tell us which of the three calls to rxe_mr_copy() is failing.
My suspicion is the one in read_reply() in rxe_resp.c.

This could be caused by a race between shutting down the qp and finishing up
an RDMA read. The responder resources state machine is completely unprotected
from simultaneous access by verbs code and bh code in rxe_resp.c. rxe_resp is
a tasklet, so all the accesses from there are serialized, but if anyone makes
a verbs call that touches the responder resources it could cause problems.
The most likely (only?) place this could happen is qp shutdown.

Bob

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 60a31b718774..66184f5a4ddf 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -489,6 +489,7 @@ int copy_data(
 		if (bytes > 0) {
 			iova = sge->addr + offset;
 
+			WARN_ON(!mr);
 			err = rxe_mr_copy(mr, iova, addr, bytes, dir);
 			if (err)
 				goto err2;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 1d95fab606da..6e3e86bdccd7 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -536,6 +536,7 @@ static enum resp_states write_data_in(struct rxe_qp *qp,
 	int	err;
 	int data_len = payload_size(pkt);
 
+	WARN_ON(!qp->resp.mr);
 	err = rxe_mr_copy(qp->resp.mr, qp->resp.va + qp->resp.offset,
 			  payload_addr(pkt), data_len, RXE_TO_MR_OBJ);
 	if (err) {
@@ -772,6 +773,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
 	if (!skb)
 		return RESPST_ERR_RNR;
 
+	WARN_ON(!mr);
 	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
 			  payload, RXE_FROM_MR_OBJ);
 	if (err)