Re: [PATCH] RDMA/rxe: Delete error messages triggered by incoming Read requests

On 30/08/2022 17:44, Li Zhijian wrote:


On 29/08/2022 18:21, Matsuda, Daisuke/松田 大輔 wrote:
On Monday, August 29, 2022 4:36 PM, Li Zhijian wrote:
On 29/08/2022 13:44, Daisuke Matsuda wrote:
An incoming Read request causes multiple Read responses. If a user MR to
copy data from is unavailable or responder cannot send a reply, then the
error messages can be printed for each response attempt, resulting in
message overflow.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx>
---
   drivers/infiniband/sw/rxe/rxe_resp.c | 6 +-----
   1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index b36ec5c4d5e0..4b3e8aec2fb8 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -811,8 +811,6 @@ static enum resp_states read_reply(struct rxe_qp *qp,

       err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
                 payload, RXE_FROM_MR_OBJ);

Looks like I was missing the faulting point inside rxe_mr_copy().

mr_check_range() is the only point that can return an error inside rxe_mr_copy(),
and mr_check_range() should never fail at this point, since the range is always checked in RESPST_CHK_RKEY
before read_reply() is called.

So is it safe to remove this print and add a comment instead?
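
The argument above (a range validated once up front cannot fail when re-checked chunk by chunk) can be sketched in user space as follows. This is only a rough model: struct toy_mr, toy_check_range() and toy_count_chunk_failures() are hypothetical names, not the rxe code, and the sketch assumes the MR does not change between the initial check and the later copies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory region: [iova, iova + length). */
struct toy_mr {
	uint64_t iova;
	uint64_t length;
};

/* Return 0 if [iova, iova + len) lies inside the MR, -1 otherwise.
 * Written to avoid overflow in iova + len. */
static int toy_check_range(const struct toy_mr *mr, uint64_t iova, uint64_t len)
{
	if (iova < mr->iova || len > mr->length ||
	    iova - mr->iova > mr->length - len)
		return -1;
	return 0;
}

/* Validate the whole request once, then re-check every MTU-sized chunk.
 * Returns -1 if the request is rejected up front, otherwise the number
 * of per-chunk checks that failed. */
static int toy_count_chunk_failures(const struct toy_mr *mr, uint64_t va,
				    uint64_t resid, uint64_t mtu)
{
	int failures = 0;

	assert(mtu > 0);

	if (toy_check_range(mr, va, resid))
		return -1;

	while (resid) {
		uint64_t n = resid < mtu ? resid : mtu;

		if (toy_check_range(mr, va, n))
			failures++;
		va += n;
		resid -= n;
	}
	return failures;
}
```

Every chunk is a sub-range of the range that already passed, so in this model the per-chunk count is always zero once the up-front check succeeds.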

Thanks
Zhijian



-    if (err)
-        pr_err("Failed copying memory\n");
Not related to this patch.
I'm wondering why this err is ignored: does rxe_mr_copy() do the real work, or can rxe_mr_copy() never fail?
IMO, when an error happens, the responder should notify the requester anyhow.
Practically, I have never seen rxe_mr_copy() fail before,
but I agree the implementation may be incorrect, as you mentioned.

As far as I tested, the responder replied with the requested amount of payload
even when rxe_mr_copy() was modified to fail. In this case, the
requester may mistakenly believe that it received the data correctly.

For more details, see IB Specification Vol 1-Revision-1.5 Ch.9.7.5.1.3 (page.334).

It seems appropriate to reply to the requester with the NAK code "REMOTE ACCESS ERROR"
by returning RESPST_ERR_RKEY_VIOLATION here.

See "9.7.5.2.4 REMOTE ACCESS ERROR" and "9.7.4.1.5 RESPONDER R_KEY VALIDATION".
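
A minimal user-space sketch of that suggestion follows: a copy failure is turned into an error state instead of being ignored. It is not the kernel implementation. toy_mr, toy_mr_copy() and toy_read_reply() are hypothetical stand-ins, and the enum only borrows state names from rxe_resp.c for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* State names borrowed from rxe_resp.c; the values here are illustrative. */
enum resp_states {
	RESPST_READ_REPLY,		/* more Read responses to send */
	RESPST_CLEANUP,			/* request fully answered */
	RESPST_ERR_RKEY_VIOLATION,	/* would lead to a NAK to the requester */
};

/* Hypothetical MR: a flat buffer plus a switch to force copy failures. */
struct toy_mr {
	const unsigned char *base;
	size_t len;
	int force_fail;
};

/* Copy n bytes starting at offset off out of the MR; 0 on success. */
static int toy_mr_copy(const struct toy_mr *mr, size_t off, void *dst, size_t n)
{
	if (mr->force_fail || off > mr->len || n > mr->len - off)
		return -1;
	memcpy(dst, mr->base + off, n);
	return 0;
}

/* Build one Read response. On a copy failure, return the error state
 * without advancing the read position, so no bogus payload is reported
 * as delivered. */
static enum resp_states toy_read_reply(const struct toy_mr *mr, size_t *off,
				       size_t *resid, size_t mtu,
				       unsigned char *payload)
{
	size_t n = *resid < mtu ? *resid : mtu;

	assert(mr != NULL);

	if (toy_mr_copy(mr, *off, payload, n))
		return RESPST_ERR_RKEY_VIOLATION;

	*off += n;
	*resid -= n;
	return *resid ? RESPST_READ_REPLY : RESPST_CLEANUP;
}
```

With this shape, the requester in the model never sees a successful response for data that was not actually copied. Whether RESPST_ERR_RKEY_VIOLATION is the right state in the real driver is still the open question of this thread.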




Daisuke Matsuda

Thanks
Zhijian

       if (mr)
           rxe_put(mr);

@@ -823,10 +821,8 @@ static enum resp_states read_reply(struct rxe_qp *qp,
       }

       err = rxe_xmit_packet(qp, &ack_pkt, skb);
-    if (err) {
-        pr_err("Failed sending RDMA reply.\n");
+    if (err)
           return RESPST_ERR_RNR;
-    }

       res->read.va += payload;
       res->read.resid -= payload;




