RE: iwarp kernel mode applications are broken with commit f35faa4ba

Hi Shiraz, Raju,

> -----Original Message-----
> From: Shiraz Saleem [mailto:shiraz.saleem@xxxxxxxxx]
> Sent: Thursday, April 26, 2018 8:48 PM
> To: Raju Rangoju <rajur@xxxxxxxxxxx>
> Cc: Parav Pandit <parav@xxxxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; SWise
> OGC <swise@xxxxxxxxxxxxxxxxxxxxx>; sean.hefty@xxxxxxxxx
> Subject: Re: iwarp kernel mode applications are broken with commit f35faa4ba
> 
> On Thu, Apr 26, 2018 at 07:46:38PM +0000, Raju  Rangoju wrote:
> > Hi Parav,
> >
> > The following commit f35faa4ba broke iWARP kernel mode applications.
> >
> > commit f35faa4ba9568138eea1c58abb92e8ef415dce41
> > Author: Parav Pandit <parav@xxxxxxxxxxxx>
> > Date:   Sun Apr 1 15:08:20 2018 +0300
> >
> >     IB/core: Simplify ib_query_gid to always refer to cache
> >
> > [root@bhumthang]# nvme discover -t rdma -a 102.1.1.17
> > Failed to write to /dev/nvme-fabrics: Invalid argument
> >
> > [root@bhumthang]# dmesg
> > [55961.151787] nvme nvme0: rdma_connect failed (-22).
> > [55961.151971] nvme nvme0: rdma connection establishment failed (-22)
> >
> > ------------
> > iser
> > -------------
> > [54714.834984] iw_cxgb4: Chelsio T4/T5 RDMA Driver - version 0.1
> > [54714.834987] iw_cxgb4: 0000:04:00.4: Up
> > [54714.834987] iw_cxgb4: 0000:04:00.4: On-Chip Queues not supported on this device
> > [54714.855963] ib_srpt MAD registration failed for cxgb4_0-1.
> > [54714.855972] ib_srpt srpt_add_one(cxgb4_0) failed.
> > [54715.123119] iw_cxgb4: 0000:07:00.4: Up
> > [54715.123121] iw_cxgb4: 0000:07:00.4: On-Chip Queues not supported on this device
> > [54715.125977] cxgb4 0000:07:00.4 enp7s0f4: port module unplugged
> > [54715.166076] ib_srpt MAD registration failed for cxgb4_1-1.
> > [54715.166080] ib_srpt srpt_add_one(cxgb4_1) failed.
> > [54834.322675] iser: iser_route_handler: failure connecting: -22
> > [54835.326918] iser: iser_route_handler: failure connecting: -22
> > [54836.331221] iser: iser_route_handler: failure connecting: -22
> > [54837.335625] iser: iser_route_handler: failure connecting: -22
> > [54838.339980] iser: iser_route_handler: failure connecting: -22
> > [54839.343882] iser: iser_route_handler: failure connecting: -22
> >
> 
> 
> My validation team reported the same issue on i40iw with 4.17-rc kernels.
> 
> Some more data. Looks like the failure is because we can't find the cached gid
> due to the gid idx being wrong in query_gid.
> 
> rdma_connect
>    cma_connect_iw
> 	cma_modify_qp_rtr
> 	    ib_query_gid
> 		ib_get_cached_gid
> 		    __ib_cache_gid_get (EINVAL)
> 
This call trace is helpful.
Can you please run ibv_devinfo -v | grep GID and check that you are getting the expected GID?
For iWARP we have only a single-entry GID table, so I want to make sure that the GID table is built correctly.

From the above call trace, it appears that
cma_modify_qp_rtr() contains
struct ib_qp_attr qp_attr;

The ah_attr in the above qp_attr remains uninitialized by iw_cm_init_qp_attr() and iwcm_init_qp_init_attr().
Before my fix, gid_index was always ignored by the query_gid() callbacks such as i40iw_query_gid() and c4iw_query_gid(),
so it used to work.
Now my fix expects all values to be correct; with the uninitialized ib_qp_attr, it is likely failing on a garbage gid_index.

So can you please try the hunk below and see whether ib_query_gid() succeeds for you?
If it works, I will send the proper patch shortly.

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8512f63..e119cff 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -863,7 +863,7 @@ void rdma_destroy_qp(struct rdma_cm_id *id)
 static int cma_modify_qp_rtr(struct rdma_id_private *id_priv,
                             struct rdma_conn_param *conn_param)
 {
-       struct ib_qp_attr qp_attr;
+       struct ib_qp_attr qp_attr = {};
        int qp_attr_mask, ret;
        union ib_gid sgid;
