Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes: > On Mon, 28 Feb 2011 16:41:24 -0600 > Russ Meyerriecks <rmeyerriecks@xxxxxxxxxx> wrote: > >> From: Shaun Ruffell <sruffell@xxxxxxxxxx> >> >> Eliminates a circular lock dependency reported by lockdep. When reading the >> "pools" file from a PCI device via sysfs, the s_active lock is acquired before >> the pools_lock. When unloading the driver and destroying the pool, pools_lock >> is acquired before the s_active lock. >> >> cat/12016 is trying to acquire lock: >> (pools_lock){+.+.+.}, at: [<c04ef113>] show_pools+0x43/0x140 >> >> but task is already holding lock: >> (s_active#82){++++.+}, at: [<c0554e1b>] sysfs_read_file+0xab/0x160 >> >> which lock already depends on the new lock. > > sysfs_dirent_init_lockdep() and the 6992f53349 ("sysfs: Use one lockdep > class per sysfs attribute") which added it are rather scary. > > The alleged bug appears to be due to taking pools_lock outside > device_create_file() (which takes magical sysfs PseudoVirtualLocks) > versus show_pools(), which takes pools_lock but is called from inside > magical sysfs PseudoVirtualLocks. > > I don't know if this is actually a real bug or not. Probably not, as > this device_create_file() does not match the reasons for 6992f53349: > "There is a sysfs idiom where writing to one sysfs file causes the > addition or removal of other sysfs files". But that's a guess. device_create_file is arguable But this also happens with device_remove_file, and that is exactly the deadlock scenario I added the lockdep annotation to catch. So the patch clearly does not fix the issue. Eric >> --- a/mm/dmapool.c >> +++ b/mm/dmapool.c >> @@ -174,21 +174,28 @@ struct dma_pool *dma_pool_create(const char *name, struct device *dev, >> init_waitqueue_head(&retval->waitq); >> >> if (dev) { >> - int ret; >> + int first_pool; >> >> mutex_lock(&pools_lock); >> if (list_empty(&dev->dma_pools)) >> - ret = device_create_file(dev, &dev_attr_pools); >> + first_pool = 1; >> else >> - ret = 0; >> + first_pool = 0; >> /* note: not currently insisting "name" be unique */ >> - if (!ret) >> - list_add(&retval->pools, &dev->dma_pools); >> - else { >> - kfree(retval); >> - retval = NULL; >> - } >> + list_add(&retval->pools, &dev->dma_pools); >> mutex_unlock(&pools_lock); >> + >> + if (first_pool) { >> + int ret; >> + ret = device_create_file(dev, &dev_attr_pools); >> + if (ret) { >> + mutex_lock(&pools_lock); >> + list_del(&retval->pools); >> + mutex_unlock(&pools_lock); >> + kfree(retval); >> + retval = NULL; >> + } >> + } > > Not a good fix, IMO. The problem is that if two CPUs concurrently call > dma_pool_create(), the first CPU will spend time creating the sysfs > file. Meanwhile, the second CPU will whizz straight back to its > caller. The caller now thinks that the sysfs file has been created and > returns to userspace, which immediately tries to read the sysfs file. > But the first CPU hasn't finished creating it yet. Userspace fails. > > One way of fixing this would be to create another singleton lock: > > > { > static DEFINE_MUTEX(pools_sysfs_lock); > static bool pools_sysfs_done; > > mutex_lock(&pools_sysfs_lock); > if (pools_sysfs_done == false) { > create_sysfs_stuff(); > pools_sysfs_done = true; > } > mutex_unlock(&pools_sysfs_lock); > } > > That's not terribly pretty. Or possibly use module_init style magic. Where use module initialization and remove to trigger creation and deletion of the sysfs. Eric -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>