Re: [PATCH] sysfs: driver core: Fix glue dir race condition

On 2014/11/7 1:22, Greg KH wrote:
> On Thu, Nov 06, 2014 at 11:55:47AM -0500, Tejun Heo wrote:
>> Maybe "fix glue dir race condition by not removing them" is a better
>> title?
>>
>> On Thu, Nov 06, 2014 at 04:16:38PM +0800, Yijing Wang wrote:
>>> There is a race condition when removing a glue directory.
>>> It can be reproduced with the following test:
>>>
>>> path1: Add first child device
>>> device_add()
>>> 	get_device_parent()
>>> 		/*find parent from glue_dirs.list*/
>>> 		list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
>>> 			if (k->parent == parent_kobj) {
>>> 				kobj = kobject_get(k);
>>> 				break;
>>> 			}
>>> 		....
>>> 		class_dir_create_and_add()
>>>
>>> path2: Remove last child device under glue dir
>>> device_del()
>>> 	cleanup_device_parent()
>>> 		cleanup_glue_dir()
>>> 			kobject_put(glue_dir);
>>>
>>> If path2 has entered cleanup_glue_dir() but has not yet
>>> called kobject_put(glue_dir), the glue dir is still
>>> on the parent's kset list. Meanwhile, path1 finds the glue
>>> dir in glue_dirs.list. Path2 may release the glue dir
>>> before path1 calls kobject_get(), so the kernel reports
>>> the warning and BUG_ON.
>>>
>>> This fix keeps the glue dir around once it is created, as
>>> suggested by Tejun Heo.
>>
>> I think you prolly want to explain why this is okay / desired.
>> e.g. list how the glue dir is used and how many of them are there and
>> explain that there's no real benefit in removing them.
> 
> I'd really _like_ to remove them if at all possible, as if there isn't
> any "children" in the subdirectory, there shouldn't be a need for that
> directory to be there.
> 
> This seems to be the "classic" problem we have of a kref in a list that
> can be found while the last instance could be removed at the same time.
> I hate to just throw another lock at the problem, but wouldn't a lock to
> protect the list of glue_dirs be the answer here?

Hi Greg, in this case we need to close the race between the traversal of
dev->class->p->glue_dirs.list (A below) and the kobject_put(glue_dir) in
cleanup_glue_dir() (B below).

glue_dirs.list_lock only protects glue_dirs.list itself. What we need is to
keep kobject_put(glue_dir) from dropping the glue_dir ref count while we are
traversing dev->class->p->glue_dirs.list.


---------------------------------------------------------------------------
		/* find our class-directory at the parent and reference it */
		spin_lock(&dev->class->p->glue_dirs.list_lock);
		list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)     ------>A
			if (k->parent == parent_kobj) {
				kobj = kobject_get(k);
				break;
			}
		spin_unlock(&dev->class->p->glue_dirs.list_lock);
------------------------------------------------------------------------------
static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
{
	/* see if we live in a "glue" directory */
	if (!glue_dir || !dev->class ||
	    glue_dir->kset != &dev->class->p->glue_dirs)
		return;

	kobject_put(glue_dir);   --------------->B
}
------------------------------------------------------------------------------


Tejun introduced a mutex, gdp_mutex, in commit 77d3d7c1d561f49 to fix a race condition in get_device_parent().
We could reuse that mutex to fix the race between the glue_dirs.list traversal and kobject_put(glue_dir).
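
To make the idea of reusing the mutex concrete, here is a minimal user-space
sketch of the locking pattern. It is not kernel code. It assumes a pthread
mutex (model_mutex) playing the role of gdp_mutex and a hand-rolled struct glue
with a plain int refcount standing in for the kobject and its kref. All names
in it (struct glue, glue_find_get(), glue_put(), glue_count(), glue_stress())
are hypothetical. It also assumes the entry is unlinked and freed synchronously
by the last put, so it does not model a deferred release. The point it tries to
show is that lookup-and-get and the final put take the same lock, so a lookup
should never find an entry whose last reference is being dropped.

```c
/* User-space model only: every name below is a hypothetical stand-in. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct glue {
	struct glue *next;	/* stands in for the kset list entry */
	const void *parent;	/* stands in for kobj->parent */
	int refcount;		/* stands in for the kref */
};

/* Plays the role of gdp_mutex. */
static pthread_mutex_t model_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct glue *glue_list;

/* Path 1: find the entry for @parent and take a reference, or create it. */
static struct glue *glue_find_get(const void *parent)
{
	struct glue *g;

	pthread_mutex_lock(&model_mutex);
	for (g = glue_list; g; g = g->next)
		if (g->parent == parent)
			break;
	if (g) {
		/* The final put cannot run while we hold the mutex. */
		assert(g->refcount > 0);
		g->refcount++;
	} else {
		g = calloc(1, sizeof(*g));
		if (g) {
			g->parent = parent;
			g->refcount = 1;
			g->next = glue_list;
			glue_list = g;
		}
	}
	pthread_mutex_unlock(&model_mutex);
	return g;
}

/* Path 2: drop a reference; the last put unlinks and frees under the mutex. */
static void glue_put(struct glue *g)
{
	struct glue **pp;

	pthread_mutex_lock(&model_mutex);
	if (--g->refcount == 0) {
		for (pp = &glue_list; *pp; pp = &(*pp)->next)
			if (*pp == g) {
				*pp = g->next;
				break;
			}
		free(g);
	}
	pthread_mutex_unlock(&model_mutex);
}

/* Number of entries currently on the list. */
static int glue_count(void)
{
	struct glue *g;
	int n = 0;

	pthread_mutex_lock(&model_mutex);
	for (g = glue_list; g; g = g->next)
		n++;
	pthread_mutex_unlock(&model_mutex);
	return n;
}

/* Each worker repeatedly "adds" and "removes" a child under one parent. */
static void *worker(void *parent)
{
	for (int i = 0; i < 10000; i++) {
		struct glue *g = glue_find_get(parent);

		if (g)
			glue_put(g);
	}
	return NULL;
}

/* Run four add/remove loops in parallel; returns entries left on the list. */
static int glue_stress(void)
{
	static int parent;
	pthread_t t[4];
	int n = 0;

	for (int i = 0; i < 4; i++)
		if (pthread_create(&t[n], NULL, worker, &parent) == 0)
			n++;
	for (int i = 0; i < n; i++)
		pthread_join(t[i], NULL);
	return glue_count();
}
```

Every get in the workers is paired with a put, so the list should be empty
again once glue_stress() returns. If glue_put() did not take model_mutex, the
final put could free the entry between the list walk and the refcount++ in
glue_find_get(), which is the window described above.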

Greg, which of the two solutions do you prefer: reusing gdp_mutex, or not removing the glue_dir?


diff --git a/drivers/base/core.c b/drivers/base/core.c
index 28b808c..645eacf 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -724,12 +724,12 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
 	return &dir->kobj;
 }

+static DEFINE_MUTEX(gdp_mutex);

 static struct kobject *get_device_parent(struct device *dev,
 					 struct device *parent)
 {
 	if (dev->class) {
-		static DEFINE_MUTEX(gdp_mutex);
 		struct kobject *kobj = NULL;
 		struct kobject *parent_kobj;
 		struct kobject *k;
@@ -793,7 +793,9 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 	    glue_dir->kset != &dev->class->p->glue_dirs)
 		return;

+	mutex_lock(&gdp_mutex);
 	kobject_put(glue_dir);
+	mutex_unlock(&gdp_mutex);
 }

 static void cleanup_device_parent(struct device *dev)

> 
> thanks,
> 
> greg k-h
> 
> .
> 


-- 
Thanks!
Yijing
