On Thu, Jan 11, 2018 at 07:07:06PM +0000, Bart Van Assche wrote: > On Thu, 2018-01-11 at 21:00 +0200, Leon Romanovsky wrote: > > On Thu, Jan 11, 2018 at 04:02:33PM +0000, Bart Van Assche wrote: > > > On Thu, 2018-01-11 at 08:22 +0200, Leon Romanovsky wrote: > > > > The proposed patch definitely decreases the chance of races, but it is not fixing them. > > > > There is a chance to have change in qp state immediately after your "if ..." check. > > > > > > Hello Leon, > > > > > > Please have a look at rxe_qp_error() and you will see that the patch I posted > > > is a proper fix. In the scenario you described rxe_qp_error() will trigger a > > > run of rxe_completer(). > > > > Bart, > > > > What am I missing? > > > > CPU1 CPU2 > > if (unlikely.... > > <--- > > /* move the qp to the error state */ > > void rxe_qp_error(struct rxe_qp *qp) > > { > > qp->req.state = QP_STATE_ERROR; > > qp->resp.state = QP_STATE_ERROR; > > qp->attr.qp_state = IB_QPS_ERR; > > ---> > > rxe_run_task(&qp->req.task, must_sched); > > > > > > > > It is more or less the same as without "if (unlikely..." > > Hello Leon, > > In the above the part of rxe_qp_error() that I was referring to in my e-mail > is missing: > > if (qp_type(qp) == IB_QPT_RC) > rxe_run_task(&qp->comp.task, 1); But it is exactly where race exists, as long QP isn't protected, it can switch CPUs and create race. Thanks > > Bart.
Attachment:
signature.asc
Description: PGP signature