On 2018/7/20 18:26, Yuval Shaia wrote:
> On Thu, Jul 19, 2018 at 08:23:17PM +0800, Yixian Liu wrote:
>> The cqe should be flushed error completion status if an related
>> error is detected while poll cqe, post send or post recv.
>>
>> Record doorbell is used to notify the head pointer of sq and rq
>> to the kernel.
>>
>> Signed-off-by: Yixian Liu <liuyixian@xxxxxxxxxx>
>> ---
>>  kernel-headers/rdma/hns-abi.h    |  1 +
>>  providers/hns/hns_roce_u.h       |  1 +
>>  providers/hns/hns_roce_u_hw_v2.c | 53 ++++++++++++++++++++++++++++++++++++++++
>>  providers/hns/hns_roce_u_hw_v2.h |  1 +
>>  providers/hns/hns_roce_u_verbs.c | 24 +++++++++++++++---
>>  5 files changed, 77 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel-headers/rdma/hns-abi.h b/kernel-headers/rdma/hns-abi.h
>> index 78613b6..c1f8773 100644
>> --- a/kernel-headers/rdma/hns-abi.h
>> +++ b/kernel-headers/rdma/hns-abi.h
>> @@ -53,6 +53,7 @@ struct hns_roce_ib_create_qp {
>>  	__u8 log_sq_stride;
>>  	__u8 sq_no_prefetch;
>>  	__u8 reserved[5];
>> +	__aligned_u64 sdb_addr;
>>  };
>>
>>  struct hns_roce_ib_create_qp_resp {
>> diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
>> index 8426569..2b2070f 100644
>> --- a/providers/hns/hns_roce_u.h
>> +++ b/providers/hns/hns_roce_u.h
>> @@ -211,6 +211,7 @@ struct hns_roce_qp {
>>  	struct hns_roce_wq sq;
>>  	struct hns_roce_wq rq;
>>  	uint32_t *rdb;
>> +	uint32_t *sdb;
>>  	struct hns_roce_sge_ex sge;
>>  	unsigned int next_sge;
>>  	int port_num;
>> diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
>> index ca59011..588a946 100644
>> --- a/providers/hns/hns_roce_u_hw_v2.c
>> +++ b/providers/hns/hns_roce_u_hw_v2.c
>> @@ -237,6 +237,9 @@ static void hns_roce_v2_clear_qp(struct hns_roce_context *ctx, uint32_t qpn)
>>  	ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = NULL;
>>  }
>>
>> +static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
>> +				   int attr_mask);
>> +
>>  static int hns_roce_v2_poll_one(struct hns_roce_cq *cq,
>>  				struct hns_roce_qp **cur_qp, struct ibv_wc *wc)
>>  {
>> @@ -248,6 +251,9 @@ static int hns_roce_v2_poll_one(struct hns_roce_cq *cq,
>>  	struct hns_roce_v2_cqe *cqe = NULL;
>>  	struct hns_roce_rinl_sge *sge_list;
>>  	uint32_t opcode;
>> +	struct ibv_qp_attr attr;
>> +	int attr_mask;
>> +	int ret;
>>
>>  	/* According to CI, find the relative cqe */
>>  	cqe = next_cqe_sw_v2(cq);
>> @@ -314,6 +320,19 @@ static int hns_roce_v2_poll_one(struct hns_roce_cq *cq,
>>  	if (roce_get_field(cqe->byte_4, CQE_BYTE_4_STATUS_M,
>>  	    CQE_BYTE_4_STATUS_S) != HNS_ROCE_V2_CQE_SUCCESS) {
>>  		hns_roce_v2_handle_error_cqe(cqe, wc);
>> +
>> +		/* flush cqe */
>> +		if ((wc->status != IBV_WC_SUCCESS) &&
>> +		    (wc->status != IBV_WC_WR_FLUSH_ERR)) {
>> +			attr_mask = IBV_QP_STATE;
>> +			attr.qp_state = IBV_QPS_ERR;
>> +			ret = hns_roce_u_v2_modify_qp(&(*cur_qp)->ibv_qp,
>> +						      &attr, attr_mask);
>> +			if (ret) {
>> +				fprintf(stderr, PFX "failed to modify qp!\n");
>
> I do not understand why poll_one and got the honor of printing while
> post_send and post_recv are not.
>
> Is it because of caller taking care of it?
>
> Suggesting consistency.
>

Hi Yuval,

Thanks for your comment! I will fix it and keep it consistent in the
next version.
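One possible way to get the consistency discussed above is to route all
three paths (poll, post send, post recv) through a single helper that
moves the QP to the error state and reports a failure in one place. The
sketch below is only an illustration of that idea, not the code planned
for the next version. It uses stand-in types so that it builds without
libibverbs; every demo_* name, the fail_modify knob, and the message
text are hypothetical and do not exist in the hns provider.

```c
#include <stdio.h>

/* Stand-in types so the sketch builds without libibverbs (all hypothetical). */
enum demo_qp_state { DEMO_QPS_RTS, DEMO_QPS_ERR };
enum demo_wc_status { DEMO_WC_SUCCESS, DEMO_WC_WR_FLUSH_ERR, DEMO_WC_GENERAL_ERR };

struct demo_qp {
	enum demo_qp_state state;
	int fail_modify;	/* test knob: force the modify call to fail */
};

/* Hypothetical stand-in for hns_roce_u_v2_modify_qp(). */
static int demo_modify_qp(struct demo_qp *qp, enum demo_qp_state state)
{
	if (qp->fail_modify)
		return -1;
	qp->state = state;
	return 0;
}

/*
 * Same condition as in the quoted hunk: flush only for a real error,
 * not for success and not for a completion that is already a flush error.
 */
static int demo_needs_flush(enum demo_wc_status status)
{
	return status != DEMO_WC_SUCCESS && status != DEMO_WC_WR_FLUSH_ERR;
}

/*
 * Shared helper: move the QP to the error state and print on failure.
 * Callers pass their own name so the message identifies the path, and
 * no caller needs a private fprintf().
 */
static int demo_flush_cqe(struct demo_qp *qp, const char *caller)
{
	int ret = demo_modify_qp(qp, DEMO_QPS_ERR);

	if (ret)
		fprintf(stderr, "demo: %s: failed to modify qp to error state\n",
			caller);
	return ret;
}
```

With a helper of this shape, poll_one would call it only when
demo_needs_flush() is true, while post_send and post_recv would call it
from their own error paths, so all three either print or stay silent
together.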