Re: [PATCH RFC 03/12] RDMA/rtrs-srv: Only close srv_path if it is just allocated

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 11/15/22 12:18 AM, Haris Iqbal wrote:
On Sun, Nov 13, 2022 at 2:08 AM Guoqing Jiang<guoqing.jiang@xxxxxxxxx>  wrote:
RTRS creates several connections per nr_cpu_ids, it makes more sense
to only close the path when it just allocated.

Signed-off-by: Guoqing Jiang<guoqing.jiang@xxxxxxxxx>
---
  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 2cc8b423bcaa..063082d29fc6 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1833,6 +1833,7 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
         u16 version, con_num, cid;
         u16 recon_cnt;
         int err = -ECONNRESET;
+       bool alloc_path = false;

         if (len < sizeof(*msg)) {
                 pr_err("Invalid RTRS connection request\n");
@@ -1906,6 +1907,7 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
                         pr_err("RTRS server session allocation failed: %d\n", err);
                         goto reject_w_err;
                 }
+               alloc_path = true;
         }
         err = create_con(srv_path, cm_id, cid);
         if (err) {
@@ -1940,7 +1942,8 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,

  close_and_return_err:
         mutex_unlock(&srv->paths_mutex);
-       close_path(srv_path);
+       if (alloc_path)
+               close_path(srv_path);
I think the way this is coded is, if there is a problem opening a
connection in that srv_path, then it closes that srv_path gracefully,
which results in all other connections in that path getting closed,
and the IOs being failover'ed to the other path (in case of
multipath), and then the client would trigger an auto reconnect.

Could it possible that IO is happening in the previous connection, then a later
failed connection try to close all the connections.

@Jinpu can shed some more light, what would be the preferred course of
action if there is a failure in one connection. To keep the current
design or to avoid closing the path in case of a connection issue.

Frankly, I am less confident about this patch and seems it is a rare condition.
Will just drop it for now.

Thanks,
Guoqing



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux