On Fri, 2024-02-23 at 10:11 +0100, Herve Codina wrote: > Hi Saravana, > > On Tue, 20 Feb 2024 16:31:13 -0800 > Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > ... > > > > +void device_link_wait_removal(void) > > > +{ > > > + /* > > > + * devlink removal jobs are queued in the dedicated work queue. > > > + * To be sure that all removal jobs are terminated, ensure that > > > any > > > + * scheduled work has run to completion. > > > + */ > > > + drain_workqueue(fw_devlink_wq); > > > > Is there a reason this needs to be drain_workqueu() instead of > > flush_workqueue(). Drain is a stronger guarantee than we need in this > > case. All we are trying to make sure is that all the device link > > remove work queued so far have completed. > > I used drain_workqueue() because drain_workqueue() allows for jobs already > present in a workqueue to re-queue a job and drain_workqueue() will wait > also for this new job completion. > > I think flush_workqueue() doesn't wait for this chain queueing. > > In our case, my understanding was that device_link_release_fn() calls > put_device() for the consumer and the supplier. > If refcounts reaches zero, devlink_dev_release() can be called again > and re-queue a job. > Looks sensible. The only doubt (that Saravana mays know better) is that I'm not sure put_device() on a supplier or consumer can actually lead to devlink_dev_release(). AFAIU, a consumer or a supplier should not be a device from the devlink class. Hence, looking at device_release(), I'm not sure it can happen unless for some odd reason someone is messing with devlinks in .remove() or .type->remove(). - Nuno Sá