Re: [for-next PATCH v2 1/2] RDMA/rxe: Remove unnecessary mr testing

On Mon, Oct 24, 2022 at 2:05 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote:
>
> On 10/21/22 20:09, Li Zhijian wrote:
> >
> >
> > On 21/10/2022 22:39, Zhu Yanjun wrote:
> >> On Fri, Oct 21, 2022 at 3:53 PM Li Zhijian <lizhijian@xxxxxxxxxxx> wrote:
> >>> Before this check, mr has already been passed to rxe_mr_copy(), where it
> >>> may be dereferenced, so the check is not exactly correct.
> >>>
> >>> I tried to figure out how and when mr could be NULL, but failed
> >>> in the end. Add a WARN_ON(!mr) to that path to tell us more when it
> >>> happens.
> >> If I understand you correctly, you ran into a problem,
> > Not exactly. I removed the mr check because I think the check is not correct.
> > The newly added WARN_ON(!mr) marks the only place where mr can be NULL but is not handled correctly.
> > With or without this patch, once the WARN_ON(!mr) condition is hit, the kernel will go wrong.
> >
> > So I want to place this WARN_ON(!mr) there to point to the problem.
> >
> > Thanks
> > Zhijian
> >
> >>   but you could not figure it out.
> >> So you sent it upstream as a patch?
> >>
> >> I am not sure if it is a good idea.
> >>
> >> Zhu Yanjun
> >>
> >>> Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx>
> >>> ---
> >>>   drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
> >>>   1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
> >>> index ed5a09e86417..218c14fb07c6 100644
> >>> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> >>> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> >>> @@ -778,6 +778,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
> >>>          if (res->state == rdatm_res_state_new) {
> >>>                  if (!res->replay) {
> >>>                          mr = qp->resp.mr;
> >>> +                       WARN_ON(!mr);
> >>>                          qp->resp.mr = NULL;
> >>>                  } else {
> >>>                          mr = rxe_recheck_mr(qp, res->read.rkey);
> >>> @@ -811,8 +812,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
> >>>
> >>>          rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
> >>>                      payload, RXE_FROM_MR_OBJ);
> >>> -       if (mr)
> >>> -               rxe_put(mr);
> >>> +       rxe_put(mr);
> >>>
> >>>          if (bth_pad(&ack_pkt)) {
> >>>                  u8 *pad = payload_addr(&ack_pkt) + payload
> >>> --
> >>> 2.31.1
> >>>
> >
>
> Li is correct that the only way mr could be NULL is if qp->resp.mr == NULL. So the

My concern is whether "WARN_ON(!mr);" should be added at all.
IMO, if the root cause remains unclear, that is a problem in itself.
Currently this problem is not fixed. It is useless to send a debug
statement to the mailing list.

Zhu Yanjun

> 'if (mr)' is not needed if that is the case. The read_reply subroutine is reached
> from a new rdma read operation after going through check_rkey, or from a previous
> rdma read operation from get_req if qp->resp.res != NULL, or from a duplicate request
> where the previous responder resource is found. In all these cases the mr is set.
> Initially this happens in check_rkey, which causes an RKEY_VIOLATION if it can't find the mr.
> Thereafter the rkey is stored in the responder resources and looked up for each
> packet to get an mr or cause an RKEY_VIOLATION. So the mr can't be NULL. I think
> you can leave out the WARN and just drop the if (mr).
>
> Bob
>
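
The reasoning in the quoted paragraph can be sketched as a small standalone C
model. This is only an illustration under assumptions, not the rxe code: every
name below (model_mr, model_qp, model_lookup_mr, model_check_rkey,
model_read_reply) is hypothetical, and the reference count is a plain integer
rather than the kernel's refcounting. It shows the pattern being argued for: a
failed lookup ends in an RKEY violation before the read-reply step, so the
read-reply step can put the mr unconditionally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rxe types; not the kernel definitions. */
struct model_mr {
	unsigned int rkey;
	int refcnt;
};

enum model_state {
	MODEL_READ_REPLY,
	MODEL_RKEY_VIOLATION,
	MODEL_DONE,
};

struct model_qp {
	struct model_mr *table;    /* registered MRs */
	size_t nr_mrs;
	struct model_mr *resp_mr;  /* plays the role of qp->resp.mr */
};

/* Look up an MR by rkey and take a reference; NULL if none matches. */
static struct model_mr *model_lookup_mr(struct model_qp *qp, unsigned int rkey)
{
	for (size_t i = 0; i < qp->nr_mrs; i++) {
		if (qp->table[i].rkey == rkey) {
			qp->table[i].refcnt++;
			return &qp->table[i];
		}
	}
	return NULL;
}

/* Gate: a failed lookup never proceeds to the read-reply step. */
static enum model_state model_check_rkey(struct model_qp *qp, unsigned int rkey)
{
	struct model_mr *mr = model_lookup_mr(qp, rkey);

	if (!mr)
		return MODEL_RKEY_VIOLATION;
	qp->resp_mr = mr;
	return MODEL_READ_REPLY;
}

/* Only reached after the gate, so the put needs no NULL test. */
static enum model_state model_read_reply(struct model_qp *qp)
{
	struct model_mr *mr = qp->resp_mr;

	assert(mr != NULL);  /* the invariant described in the quoted text */
	qp->resp_mr = NULL;
	mr->refcnt--;        /* unconditional put */
	return MODEL_DONE;
}
```

In this model the assert plays the role that WARN_ON(!mr) would play in the
patch. If the gate really covers every path into the read-reply step, the
assert can never fire and only the removal of the NULL test matters.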


