> -----Original Message----- > From: Jason Gunthorpe [mailto:jgg@xxxxxxxx] > Sent: Thursday, May 17, 2018 11:59 AM > To: Parav Pandit <parav@xxxxxxxxxxxx> > Cc: Leon Romanovsky <leon@xxxxxxxxxx>; Doug Ledford > <dledford@xxxxxxxxxx>; Leon Romanovsky <leonro@xxxxxxxxxxxx>; RDMA > mailing list <linux-rdma@xxxxxxxxxxxxxxx> > Subject: Re: [PATCH rdma-next 05/12] IB/core: Introduce GID entry reference > counts for RoCE > > On Thu, May 17, 2018 at 04:51:48PM +0000, Parav Pandit wrote: > > > > > > > From: Jason Gunthorpe [mailto:jgg@xxxxxxxx] > > > Sent: Thursday, May 17, 2018 11:43 AM > > > To: Parav Pandit <parav@xxxxxxxxxxxx> > > > Cc: Leon Romanovsky <leon@xxxxxxxxxx>; Doug Ledford > > > <dledford@xxxxxxxxxx>; Leon Romanovsky <leonro@xxxxxxxxxxxx>; > RDMA > > > mailing list <linux-rdma@xxxxxxxxxxxxxxx> > > > Subject: Re: [PATCH rdma-next 05/12] IB/core: Introduce GID entry > > > reference counts for RoCE > > > > > > On Thu, May 17, 2018 at 04:39:38PM +0000, Parav Pandit wrote: > > > > > > @@ -1297,6 +1355,12 @@ void ib_cache_cleanup_one(struct > > > > > > ib_device > > > > > *device) > > > > > > ib_unregister_event_handler(&device->cache.event_handler); > > > > > > flush_workqueue(ib_wq); > > > > > > gid_table_cleanup_one(device); > > > > > > + > > > > > > + /* > > > > > > + * Flush the wq second time for any pending GID delete > > > > > > + * work for RoCE device. > > > > > > + */ > > > > > > + flush_workqueue(ib_wq); > > > > > > } > > > > > > > > > > Humm.. Either this flush is not needed or it isn't strong enough. > > > > > > > > > > I think we need to wait until the gid table is fully free'd and > > > > > all PENDING_DELS are completed ? > > > > > > > > > This second flush waits for gid table being fully freed. > > > > > > No, it only waits for frees in progress to complete, it doesn't wait > > > for anyone out there holding a kref to drop the ref. > > > > > That is inline with how any stale access to ibdevice can be present. > > It is stack wide behavior to all callers to drop respective data > > structure reference before it can be released. Callers should continue > > to follow client api to drop the reference to device by destroying > > their respective data structures such as cm_id etc when > > remove() is called. Without that more crashes can happen even today > > in unrelated areas to gid entry. So it is not the problem that kref > > of the gid entry should be solving, if at all it exist. > > It is worth a warn on to check that the table is truely empty though, to catch bad > ULPs. > Yep. Adding WARN_ON in release_gid_table(). > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html