Re: [PATCH] mm/dmapool.c: Do not create/destroy sysfs file while holding pools_lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes:

> On Mon, 28 Feb 2011 16:41:24 -0600
> Russ Meyerriecks <rmeyerriecks@xxxxxxxxxx> wrote:
>
>> From: Shaun Ruffell <sruffell@xxxxxxxxxx>
>> 
>> Eliminates a circular lock dependency reported by lockdep. When reading the
>> "pools" file from a PCI device via sysfs, the s_active lock is acquired before
>> the pools_lock. When unloading the driver and destroying the pool, pools_lock
>> is acquired before the s_active lock.
>> 
>>  cat/12016 is trying to acquire lock:
>>   (pools_lock){+.+.+.}, at: [<c04ef113>] show_pools+0x43/0x140
>> 
>>  but task is already holding lock:
>>   (s_active#82){++++.+}, at: [<c0554e1b>] sysfs_read_file+0xab/0x160
>> 
>>  which lock already depends on the new lock.
>
> sysfs_dirent_init_lockdep() and the 6992f53349 ("sysfs: Use one lockdep
> class per sysfs attribute") which added it are rather scary.
>
> The alleged bug appears to be due to taking pools_lock outside
> device_create_file() (which takes magical sysfs PseudoVirtualLocks)
> versus show_pools(), which takes pools_lock but is called from inside
> magical sysfs PseudoVirtualLocks.
>
> I don't know if this is actually a real bug or not.  Probably not, as
> this device_create_file() does not match the reasons for 6992f53349:
> "There is a sysfs idiom where writing to one sysfs file causes the
> addition or removal of other sysfs files".  But that's a guess.

device_create_file is arguable But this also happens with
device_remove_file, and that is exactly the deadlock scenario I added
the lockdep annotation to catch.  So the patch clearly does not fix the
issue.

Eric


>> --- a/mm/dmapool.c
>> +++ b/mm/dmapool.c
>> @@ -174,21 +174,28 @@ struct dma_pool *dma_pool_create(const char *name, struct device *dev,
>>  	init_waitqueue_head(&retval->waitq);
>>  
>>  	if (dev) {
>> -		int ret;
>> +		int first_pool;
>>  
>>  		mutex_lock(&pools_lock);
>>  		if (list_empty(&dev->dma_pools))
>> -			ret = device_create_file(dev, &dev_attr_pools);
>> +			first_pool = 1;
>>  		else
>> -			ret = 0;
>> +			first_pool = 0;
>>  		/* note:  not currently insisting "name" be unique */
>> -		if (!ret)
>> -			list_add(&retval->pools, &dev->dma_pools);
>> -		else {
>> -			kfree(retval);
>> -			retval = NULL;
>> -		}
>> +		list_add(&retval->pools, &dev->dma_pools);
>>  		mutex_unlock(&pools_lock);
>> +
>> +		if (first_pool) {
>> +			int ret;
>> +			ret = device_create_file(dev, &dev_attr_pools);
>> +			if (ret) {
>> +				mutex_lock(&pools_lock);
>> +				list_del(&retval->pools);
>> +				mutex_unlock(&pools_lock);
>> +				kfree(retval);
>> +				retval = NULL;
>> +			}
>> +		}
>
> Not a good fix, IMO.  The problem is that if two CPUs concurrently call
> dma_pool_create(), the first CPU will spend time creating the sysfs
> file.  Meanwhile, the second CPU will whizz straight back to its
> caller.  The caller now thinks that the sysfs file has been created and
> returns to userspace, which immediately tries to read the sysfs file. 
> But the first CPU hasn't finished creating it yet.  Userspace fails.
>
> One way of fixing this would be to create another singleton lock:
>
>
> 	{
> 		static DEFINE_MUTEX(pools_sysfs_lock);
> 		static bool pools_sysfs_done;
>
> 		mutex_lock(&pools_sysfs_lock);
> 		if (pools_sysfs_done == false) {
> 			create_sysfs_stuff();
> 			pools_sysfs_done = true;
> 		}
> 		mutex_unlock(&pools_sysfs_lock);
> 	}
>
> That's not terribly pretty.

Or possibly use module_init style magic.  Where use module
initialization and remove to trigger creation and deletion of the sysfs.

Eric

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]