> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Wednesday, April 7, 2021 8:44 PM
>
> On Wed, Apr 07, 2021 at 03:06:35PM +0000, Parav Pandit wrote:
> >
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Tuesday, April 6, 2021 9:17 PM
> > >
> > > On Mon, Apr 05, 2021 at 08:49:56AM +0300, Leon Romanovsky wrote:
> > > > @@ -2293,6 +2295,17 @@ static void ib_sa_event(struct ib_event_handler *handler,
> > > >  	}
> > > >  }
> > > >
> > > > +static bool ib_sa_client_supported(struct ib_device *device) {
> > > > +	unsigned int i;
> > > > +
> > > > +	rdma_for_each_port(device, i) {
> > > > +		if (rdma_cap_ib_sa(device, i))
> > > > +			return true;
> > > > +	}
> > > > +	return false;
> > > > +}
> > >
> > > This is already done though:
> >
> > It is, but ib_sa_device() allocates an ib_sa_device worth of struct for
> > each port without checking rdma_cap_ib_sa(). This results in
> > allocating 40 * 512 = 20480 bytes, rounded off to a power of 2 to 32K
> > bytes, of memory for an rdma device with 512 ports. Other modules are
> > also similarly wasting such memory.
>
> If it returns EOPNOTSUPP then the remove is never called so if it allocated
> memory and left it allocated then it is leaking memory.
>
I probably confused you. There is no leak today: add_one() allocates the
memory, and when the per-port SA/CM etc. cap is not present, the memory is
left unused until it is freed in remove_one().
Returning EOPNOTSUPP is fine at the start of add_one(), before the allocation.

> If you are saying 32k bytes of temporary allocation matters during device
> startup then it needs benchmarks and a use case.
>
The use case is clear and explained in the commit logs, i.e. to not
allocate memory that is never used.

> > > The add_one function should return -EOPNOTSUPP if it doesn't want to
> > > run on this device and any supported checks should just be at the
> > > front - this is how things work right now
> > >
> > I am ok to fold this check at the beginning of the add callback. When
> > 512 to 1K RoCE devices are used, they do not have SA, CM, CMA etc. caps
> > on, and all the clients need to go through refcnt + xa + sem and unroll
> > them. The is_supported() routine helps to cut down all of it. I didn't
> > calculate the usec saved with it.
>
> If that is the reason then explain in the cover letter and provide benchmarks

I doubt it will be significant but I will do a benchmark.
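
For illustration only, here is a minimal userspace sketch of the ordering discussed above: the capability check sits at the front of the add callback and returns -EOPNOTSUPP before any per-port memory is allocated. This is not the kernel code. Every name in it (mock_device, mock_client_supported, mock_add_one, mock_remove_one, round_up_pow2) is hypothetical, the per-port bool array merely stands in for rdma_cap_ib_sa(), and the 40-byte per-port size is taken from the estimate quoted above rather than measured.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-port client state; 40 bytes as in the
 * estimate from the thread, not the real struct size. */
struct mock_port_state {
	char pad[40];
};

/* Hypothetical stand-in for struct ib_device. */
struct mock_device {
	unsigned int num_ports;
	const bool *port_has_sa;        /* stands in for rdma_cap_ib_sa() per port */
	struct mock_port_state *ports;  /* NULL until add succeeds */
};

/* Round n up to the next power of two, to mimic an allocator that serves
 * requests from power-of-two sized buckets (assumption from the thread). */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* True if at least one port of the device has the SA capability. */
static bool mock_client_supported(const struct mock_device *dev)
{
	unsigned int i;

	for (i = 0; i < dev->num_ports; i++) {
		if (dev->port_has_sa[i])
			return true;
	}
	return false;
}

/* Add callback: the support check comes first, so an unsupported device
 * never allocates and nothing is left behind for a remove callback. */
static int mock_add_one(struct mock_device *dev)
{
	assert(dev != NULL);

	if (!mock_client_supported(dev))
		return -EOPNOTSUPP;

	dev->ports = calloc(dev->num_ports, sizeof(*dev->ports));
	if (!dev->ports)
		return -ENOMEM;
	return 0;
}

/* Remove callback: frees what a successful add allocated. */
static void mock_remove_one(struct mock_device *dev)
{
	free(dev->ports);
	dev->ports = NULL;
}
```

With 512 ports and no SA-capable port, mock_add_one() returns -EOPNOTSUPP and allocates nothing. The 40 * 512 = 20480 byte request that would otherwise be made corresponds to round_up_pow2(20480) = 32768 bytes under the power-of-two bucket assumption.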