On 2014/11/7 1:22, Greg KH wrote:
> On Thu, Nov 06, 2014 at 11:55:47AM -0500, Tejun Heo wrote:
>> Maybe "fix glue dir race condition by not removing them" is a better
>> title?
>>
>> On Thu, Nov 06, 2014 at 04:16:38PM +0800, Yijing Wang wrote:
>>> There is a race condition when removing a glue directory.
>>> It can be reproduced with the following test:
>>>
>>> path 1: add the first child device
>>> device_add()
>>>   get_device_parent()
>>>     /* find the parent in glue_dirs.list */
>>>     list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
>>>       if (k->parent == parent_kobj) {
>>>         kobj = kobject_get(k);
>>>         break;
>>>       }
>>>     ....
>>>     class_dir_create_and_add()
>>>
>>> path 2: remove the last child device under the glue dir
>>> device_del()
>>>   cleanup_device_parent()
>>>     cleanup_glue_dir()
>>>       kobject_put(glue_dir);
>>>
>>> If path 2 has entered cleanup_glue_dir() but has not yet called
>>> kobject_put(glue_dir), the glue dir is still in the parent's kset
>>> list.  Meanwhile, path 1 finds the glue dir on glue_dirs.list.
>>> Path 2 may then release the glue dir before path 1 calls
>>> kobject_get(), so the kernel reports a warning and hits a BUG_ON.
>>>
>>> This fix keeps the glue dir around once it is created, as
>>> suggested by Tejun Heo.
>>
>> I think you prolly want to explain why this is okay / desired.
>> e.g. list how the glue dir is used and how many of them are there and
>> explain that there's no real benefit in removing them.
>
> I'd really _like_ to remove them if at all possible, as if there aren't
> any "children" in the subdirectory, there shouldn't be a need for that
> directory to be there.
>
> This seems to be the "classic" problem we have of a kref in a list that
> can be found while the last instance could be removed at the same time.
> I hate to just throw another lock at the problem, but wouldn't a lock to
> protect the list of glue_dirs be the answer here?
Hi Greg, in this case we need to close the race between traversing
dev->class->p->glue_dirs.list and kobject_put(glue_dir) in
cleanup_glue_dir().  The glue_dirs.list_lock spinlock only protects
glue_dirs.list itself; what we actually need to prevent is a
kobject_put(glue_dir) dropping the glue dir's last reference while we
are traversing dev->class->p->glue_dirs.list:

---------------------------------------------------------------------------
	/* find our class-directory at the parent and reference it */
	spin_lock(&dev->class->p->glue_dirs.list_lock);
	list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)  ------> A
		if (k->parent == parent_kobj) {
			kobj = kobject_get(k);
			break;
		}
	spin_unlock(&dev->class->p->glue_dirs.list_lock);
------------------------------------------------------------------------------
static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
{
	/* see if we live in a "glue" directory */
	if (!glue_dir || !dev->class ||
	    glue_dir->kset != &dev->class->p->glue_dirs)
		return;

	kobject_put(glue_dir);  ---------------> B
}
------------------------------------------------------------------------------

Tejun introduced the gdp_mutex mutex in commit 77d3d7c1d561f49 to fix a
race condition in get_device_parent().  We could reuse that mutex to fix
the race between the glue_dirs.list traversal (A) and the
kobject_put(glue_dir) (B).

Greg, of the two solutions (reuse gdp_mutex, or don't remove the glue dir
at all), which one do you prefer?
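To make the argument concrete, here is a minimal user-space sketch of the
gdp_mutex approach.  This is NOT the kernel code: kobj, kobj_put(),
get_parent() and cleanup_glue() are made-up, simplified stand-ins for
kobject/kobject_put(), get_device_parent() and cleanup_glue_dir(), just to
show why taking the final put under the same mutex as the lookup closes
the window:

```c
/*
 * Hypothetical user-space model of the race fix; not drivers/base/core.c.
 * The point: lookup+get (path 1) and the final put (path 2) serialize on
 * one mutex, so path 1 can never find an object whose last reference is
 * concurrently being dropped.
 */
#include <pthread.h>
#include <stdlib.h>

struct kobj {
	int ref;		/* the kernel uses an atomic kref */
};

static pthread_mutex_t gdp_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct kobj *glue_dir;	/* stand-in for the glue_dirs.list entry */
static int releases;		/* counts release (free) calls */

/* Drop one reference; the last put unlinks and frees the object. */
static void kobj_put(struct kobj *k)
{
	if (--k->ref == 0) {
		if (k == glue_dir)
			glue_dir = NULL;	/* "remove from the kset list" */
		releases++;
		free(k);
	}
}

/*
 * Path 1: look up the glue dir and take a reference, as in
 * get_device_parent().  Runs entirely under gdp_mutex.
 */
static struct kobj *get_parent(void)
{
	struct kobj *k;

	pthread_mutex_lock(&gdp_mutex);
	k = glue_dir;
	if (k)
		k->ref++;	/* kobject_get() */
	pthread_mutex_unlock(&gdp_mutex);
	return k;
}

/* Path 2: cleanup_glue_dir() with the proposed fix applied. */
static void cleanup_glue(struct kobj *k)
{
	pthread_mutex_lock(&gdp_mutex);
	kobj_put(k);
	pthread_mutex_unlock(&gdp_mutex);
}
```

With the spinlock-only scheme, path 2's final put could run between
path 1's list walk and its kobject_get(); with both paths under the one
mutex, that interleaving is impossible.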
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 28b808c..645eacf 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -724,12 +724,12 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
 	return &dir->kobj;
 }
 
+static DEFINE_MUTEX(gdp_mutex);
 
 static struct kobject *get_device_parent(struct device *dev,
 					 struct device *parent)
 {
 	if (dev->class) {
-		static DEFINE_MUTEX(gdp_mutex);
 		struct kobject *kobj = NULL;
 		struct kobject *parent_kobj;
 		struct kobject *k;
@@ -793,7 +793,9 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 	    glue_dir->kset != &dev->class->p->glue_dirs)
 		return;
 
+	mutex_lock(&gdp_mutex);
 	kobject_put(glue_dir);
+	mutex_unlock(&gdp_mutex);
 }
 
 static void cleanup_device_parent(struct device *dev)

>
> thanks,
>
> greg k-h

-- 
Thanks!
Yijing