On Wed, Dec 13, 2017 at 01:05:49PM +0000, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-12-13 12:49:36)
> > PROBE_DEFER also uses system_wq to reprobe drivers, which means when
> > that again fails, and we try to flush the overall system_wq (to get
> > all the delayed connector cleanup work_structs completed), we
> > deadlock.
> > 
> > Fix this by using just a single cleanup work, so that we can only
> > flush that one and don't block on anything else. That means a free
> > list plus locking, a standard pattern.
> > 
> > v2:
> > - Correctly free connectors only on last ref. Oops (Chris).
> > - Use llist_head/node (Chris).
> > 
> > Fixes: a703c55004e1 ("drm: safely free connectors from connector_iter")
> > Fixes: 613051dac40d ("drm: locking&new iterators for connector_list")
> > Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
> > Cc: Dave Airlie <airlied@xxxxxxxxx>
> > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Sean Paul <seanpaul@xxxxxxxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx> # v4.11+: 613051dac40d ("drm: locking&new iterators for connector_list")
> > Cc: <stable@xxxxxxxxxxxxxxx> # v4.11+
> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> > Cc: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> > Cc: Gustavo Padovan <gustavo@xxxxxxxxxxx>
> > Cc: David Airlie <airlied@xxxxxxxx>
> > Cc: Javier Martinez Canillas <javier@xxxxxxxxxxxx>
> > Cc: Shuah Khan <shuahkh@xxxxxxxxxxxxxxx>
> > Cc: Guillaume Tucker <guillaume.tucker@xxxxxxxxxxxxx>
> > Cc: Mark Brown <broonie@xxxxxxxxxx>
> > Cc: Kevin Hilman <khilman@xxxxxxxxxxxx>
> > Cc: Matt Hart <matthew.hart@xxxxxxxxxx>
> > Cc: Thierry Escande <thierry.escande@xxxxxxxxxxxxxxx>
> > Cc: Tomeu Vizoso <tomeu.vizoso@xxxxxxxxxxxxx>
> > Cc: Enric Balletbo i Serra <enric.balletbo@xxxxxxxxxxxxx>
> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/drm_connector.c     | 50 ++++++++++++++++++++++++++-----------
> >  drivers/gpu/drm/drm_crtc_internal.h |  1 +
> >  drivers/gpu/drm/drm_mode_config.c   |  4 ++-
> >  include/drm/drm_connector.h         | 10 +++++---
> >  include/drm/drm_mode_config.h       | 18 ++++++++++++-
> >  5 files changed, 62 insertions(+), 21 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> > index 0b7e0974e6da..3f53f127e1f2 100644
> > --- a/drivers/gpu/drm/drm_connector.c
> > +++ b/drivers/gpu/drm/drm_connector.c
> > @@ -153,14 +153,23 @@ static void drm_connector_free(struct kref *kref)
> >  	connector->funcs->destroy(connector);
> >  }
> >  
> > -static void drm_connector_free_work_fn(struct work_struct *work)
> > +void drm_connector_free_work_fn(struct work_struct *work)
> >  {
> > -	struct drm_connector *connector =
> > -		container_of(work, struct drm_connector, free_work);
> > -	struct drm_device *dev = connector->dev;
> > +	struct drm_connector *connector, *n;
> > +	struct drm_device *dev =
> > +		container_of(work, struct drm_device, mode_config.connector_free_work);
> > +	struct drm_mode_config *config = &dev->mode_config;
> > +	unsigned long flags;
> > +	struct llist_node *freed;
> >  
> > -	drm_mode_object_unregister(dev, &connector->base);
> > -	connector->funcs->destroy(connector);
> > +	spin_lock_irqsave(&config->connector_list_lock, flags);
> > +	freed = llist_del_all(&config->connector_free_list);
> > +	spin_unlock_irqrestore(&config->connector_list_lock, flags);
> 
> My understanding is that the spinlock here is only used to guard the
> free_list. (It's not protecting the final refcount.) In which case it is
> redundant, as llist_del_all/llist_add are a safe lockless combination.
> That just makes the patch bigger than it has to be, but it looks correct.
> 
> > +__drm_connector_put_safe(struct drm_connector *conn)
> >  {
> > -	if (refcount_dec_and_test(&conn->base.refcount.refcount))
> > -		schedule_work(&conn->free_work);
> > +	struct drm_mode_config *config = &conn->dev->mode_config;
> > +
> > +	lockdep_assert_held(&config->connector_list_lock);
> > +
> > +	if (!refcount_dec_and_test(&conn->base.refcount.refcount))
> > +		return;
> > +
> > +	llist_add(&conn->free_node, &config->connector_free_list);
> > +	schedule_work(&config->connector_free_work);
> 
> (Didn't like the if (llist_add) nano-optimisation? :)

I thought that one might race, since the schedule_work is what provides
the crucial barrier here. But then I was too lazy to read up on all the
llist guarantees and just figured I'd keep the spin_lock around
everything.

But yeah, it's all neatly lockless, so now I'm tempted to redo it all. If
CI spots something I'll include it in the respin for sure.
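Just so we're talking about the same thing, roughly the direction I'd
take for a lockless respin. Completely untested sketch, reusing the
names from the patch above; the destroy loop is filled in from the
existing code:

/* Untested sketch: put path without taking the spinlock for the free
 * list. llist_add() returns true only when the list was empty before,
 * so one schedule_work() per batch is enough.
 */
static void __drm_connector_put_safe(struct drm_connector *conn)
{
        struct drm_mode_config *config = &conn->dev->mode_config;

        /* connector_list_lock is still held by the iterator callers,
         * it's just not needed for the free list itself anymore. */
        lockdep_assert_held(&config->connector_list_lock);

        if (!refcount_dec_and_test(&conn->base.refcount.refcount))
                return;

        if (llist_add(&conn->free_node, &config->connector_free_list))
                schedule_work(&config->connector_free_work);
}

/* Untested sketch: the worker grabs the entire batch atomically, no
 * locking needed against concurrent llist_add(). */
void drm_connector_free_work_fn(struct work_struct *work)
{
        struct drm_device *dev =
                container_of(work, struct drm_device, mode_config.connector_free_work);
        struct drm_connector *connector, *n;
        struct llist_node *freed;

        freed = llist_del_all(&dev->mode_config.connector_free_list);

        llist_for_each_entry_safe(connector, n, freed, free_node) {
                drm_mode_object_unregister(dev, &connector->base);
                connector->funcs->destroy(connector);
        }
}

If I read llist.h correctly, llist_add() in producers vs. llist_del_all()
in the consumer is exactly the combination that doesn't need a lock, and
the schedule_work() should provide the ordering between adding to the
list and the worker running.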
> > diff --git a/drivers/gpu/drm/drm_crtc_internal.h b/drivers/gpu/drm/drm_crtc_internal.h
> > index 9ebb8841778c..af00f42ba269 100644
> > --- a/drivers/gpu/drm/drm_crtc_internal.h
> > +++ b/drivers/gpu/drm/drm_crtc_internal.h
> > @@ -142,6 +142,7 @@ int drm_mode_connector_set_obj_prop(struct drm_mode_object *obj,
> >  				    uint64_t value);
> >  int drm_connector_create_standard_properties(struct drm_device *dev);
> >  const char *drm_get_connector_force_name(enum drm_connector_force force);
> > +void drm_connector_free_work_fn(struct work_struct *work);
> >  
> >  /* IOCTL */
> >  int drm_mode_connector_property_set_ioctl(struct drm_device *dev,
> > diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> > index 6ffe952142e6..7681269abe41 100644
> > --- a/drivers/gpu/drm/drm_mode_config.c
> > +++ b/drivers/gpu/drm/drm_mode_config.c
> > @@ -382,6 +382,8 @@ void drm_mode_config_init(struct drm_device *dev)
> >  	ida_init(&dev->mode_config.connector_ida);
> >  	spin_lock_init(&dev->mode_config.connector_list_lock);
> >  
> > +	INIT_WORK(&dev->mode_config.connector_free_work, drm_connector_free_work_fn);
> 
> An init_llist_head(&dev->mode_config.connector_free_list) wouldn't go
> amiss here. So perhaps push the connector init into its own exported
> function from drm_connector.c as opposed to exposing the free_fn.

Imo it doesn't matter much how we go about drm.ko internals. But I'll
stick the init_llist_head in there when applying; somehow I didn't find
it (why is every kernel data type slightly different in this?).

> Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch