On Wed, Dec 09, 2009 at 02:07:53PM +0800, Li Zefan wrote:
> Ben Blum wrote:
> > On Tue, Dec 08, 2009 at 03:38:43PM +0800, Li Zefan wrote:
> >>> @@ -1291,6 +1324,7 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
> >>>      struct cgroupfs_root *new_root;
> >>>
> >>>      /* First find the desired set of subsystems */
> >>> +    down_read(&subsys_mutex);
> >> Hmm.. this can lead to deadlock. sget() returns success with sb->s_umount
> >> held, so here we have:
> >>
> >>     down_read(&subsys_mutex);
> >>
> >>     down_write(&sb->s_umount);
> >>
> >> On the other hand, sb->s_umount is held before calling kill_sb(),
> >> so when umounting we have:
> >>
> >>     down_write(&sb->s_umount);
> >>
> >>     down_read(&subsys_mutex);
> >
> > Unless I'm gravely mistaken, you can't have deadlock on an rwsem when
> > it's being taken for reading in both cases? You would have to have at
> > least one of the cases being down_write.
> >
>
> lockdep will warn on this..

Hm. Why did I not see this warning...?

> And it can really lead to deadlock, though not so obviously:
>
> thread 1       thread 2        thread 3
> -------------------------------------------
> | read(A)      write(B)
> |
> |                              write(A)
> |
> |              read(A)
> |
> | write(B)
> |
>
> t3 is waiting for t1 to release the lock, then t2 tries to
> acquire A lock to read, but it has to wait because of t3,
> and t1 has to wait t2.
>
> Note: a read lock has to wait if a write lock is already
> waiting for the lock.

Okay, clever, the deadlock happens because of a behavioural optimization
of the rwsems. Good catch on the whole issue.

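To convince myself, here is a toy model of the three-thread schedule quoted above. It is only a sketch: toy_rwsem and its helpers are made-up names, and the one rule it encodes (a new reader is refused while a writer is queued) is an assumption taken from the note above, not the kernel's actual rwsem code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for a writer-preferring rwsem. NOT the kernel implementation. */
struct toy_rwsem {
    int readers;          /* number of current read holders */
    bool writer;          /* true while held for write */
    bool writer_waiting;  /* true once a writer has queued up */
};

/* A new reader is admitted only if no writer holds or is waiting. */
static bool toy_try_read(struct toy_rwsem *s)
{
    if (s->writer || s->writer_waiting)
        return false;
    s->readers++;
    return true;
}

/* A writer needs exclusive access; if it cannot get it, it queues itself. */
static bool toy_try_write(struct toy_rwsem *s)
{
    if (s->writer || s->readers > 0) {
        s->writer_waiting = true;
        return false;
    }
    s->writer = true;
    s->writer_waiting = false;
    return true;
}

/* Replay the quoted schedule and count how many threads end up blocked. */
static int replay_schedule(void)
{
    struct toy_rwsem A = {0}, B = {0};
    int blocked = 0;

    toy_try_read(&A);                    /* t1: read(A) succeeds        */
    toy_try_write(&B);                   /* t2: write(B) succeeds       */
    if (!toy_try_write(&A)) blocked++;   /* t3: write(A) waits on t1    */
    if (!toy_try_read(&A))  blocked++;   /* t2: read(A) waits behind t3 */
    if (!toy_try_write(&B)) blocked++;   /* t1: write(B) waits on t2    */
    return blocked;
}
```

In this model all three threads end up waiting on each other, which is the cycle described above. Without the queued writer, the second read(A) would be admitted and the cycle would not close.
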
How does this sound as a possible solution, in cgroup_get_sb:

1) Take subsys_mutex
2) Call parse_cgroupfs_options()
3) Drop subsys_mutex
4) Call sget(), which gets sb->s_umount without subsys_mutex held
5) Take subsys_mutex
6) Call verify_cgroupfs_options()
7) Proceed as normal

In which verify_cgroupfs_options will be a new function that ensures the
invariants that rebind_subsystems expects are still there; if not, bail
out by jumping to drop_new_super just as if parse_cgroupfs_options had
failed in the first place.

Another question: What's the justification for having an interface of
seemingly symmetrical "initialize" and "destroy" functions, one of which
has to take a lock and the other gets called with the lock already held?
Seems like it's asking for trouble.

-- bblum
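
For reference, the seven steps above could look roughly like the following userspace sketch. Everything here is a stand-in: the boolean flags model subsys_mutex and sb->s_umount, registered_subsys is a hypothetical bitmask of loaded subsystems, and toy_verify simply repeats the parse check, which may not be all that the real verify_cgroupfs_options would need to do.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for lock state; real code would use subsys_mutex and sb->s_umount. */
static bool subsys_held, umount_held;
static bool order_violation;                  /* set if s_umount is taken under subsys_mutex */
static unsigned long registered_subsys = 0x3; /* hypothetical mask of loaded subsystems */

static void take_subsys(void) { subsys_held = true; }
static void drop_subsys(void) { subsys_held = false; }

/* Models sget(): returns with s_umount held, and records a bad lock order. */
static void toy_sget(void)
{
    if (subsys_held)
        order_violation = true;
    umount_held = true;
}

/* Models the parse step: only currently registered subsystems are accepted. */
static int toy_parse(unsigned long wanted)
{
    return (wanted & ~registered_subsys) ? -1 : 0;
}

/* Hypothetical verify step: re-check the same invariant under the lock. */
static int toy_verify(unsigned long wanted)
{
    return toy_parse(wanted);
}

/* unload_in_window simulates subsystems going away while no lock is held. */
static int toy_get_sb(unsigned long wanted, unsigned long unload_in_window)
{
    int ret;

    take_subsys();                            /* 1) take subsys_mutex   */
    ret = toy_parse(wanted);                  /* 2) parse options       */
    drop_subsys();                            /* 3) drop subsys_mutex   */
    if (ret)
        return ret;

    registered_subsys &= ~unload_in_window;   /* state may change here  */

    toy_sget();                               /* 4) sget takes s_umount */
    take_subsys();                            /* 5) retake subsys_mutex */
    ret = toy_verify(wanted);                 /* 6) verify invariants   */

    /* 7) proceed as normal, or bail out as drop_new_super would */
    drop_subsys();
    umount_held = false;
    return ret;
}
```

The point of the sketch is only the ordering: s_umount is never acquired while subsys_mutex is held, and a subsystem that disappears between steps 3 and 5 is caught at step 6 rather than by rebind_subsystems.
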