On Tue 15-02-11 13:04:35, Ted Ts'o wrote:
> On Tue, Feb 15, 2011 at 06:29:54PM +0100, Jan Kara wrote:
> >   Sadly this does not quite work because even down_read(&sb->s_umount)
> > in thaw_super() can block if there is another process that tries to acquire
> > s_umount for writing - a situation like:
> >   TASK 1 (e.g. flusher)      TASK 2 (e.g. remount)      TASK 3 (unfreeze)
> > down_read(&sb->s_umount)
> >   block on s_frozen
> >                              down_write(&sb->s_umount)
> >                                -blocked
> >                                                         down_read(&sb->s_umount)
> >                                                           -blocked
> >                                                         behind the write access...
>
> OK, sorry for being dense, but why does this cause a deadlock?  What
> are you imaging TASK 3 doing that would impede the flusher from
> eventually resuming?  Or how would TASK 3 prevent userspace from
> completing whatever it needs to do (say, a device mapper ioctl)?
  I was arguing that using down_read(sb->s_umount) in thaw_super() instead
of down_write() does not solve anything. The deadlock as originally
reported can still happen, you just need another task (TASK 2 in the above
scheme) to block in down_write() before thaw_super() happens.

> freeze_fs has always been inherently dangerous if the userspace does
> not know what it's doing.  If it freezes the root file system, and
> then while the file system is frozen, userspace attempts to modify
> /etc/mtab, it's going to lose.  I've in the past argued for some kind
> of safety timeout that prevents the system from wedging, but the
> argument I've gotten back is (a) it's too complex, and (b) userspace
> programmers aren't that stupid, and (c) it could cause the filesystem
> to unfreeze when userspace wasn't expecting it.  Oh, and (d) if the
> system wedges up due to userspace being stupid, it's acceptable.
>
> Obviously, if the kernel does something to itself that causes a
> deadlock, we need to fix it, but userspace doing something stupid has
> been explicitly ruled out of scope, at least in previous
> discussions...
>
> > And in particular ext4 has another deadlock of this kind because it does
> > IO from ext4_remount() e.g. when doing online resize (I know it's a bit
> > artifical but still ;).
>
> OK, I'm being dense again.  How does remount and online resize relate
> with each other?  and it's not I/O in general which is a problem, it's
> writeback activity which causes a problem because it takes a read lock
> on s_umount, right?
  The problem is starting a transaction while holding the s_umount
semaphore, or actually any lock that thaw_super() (including the
per-filesystem ->unfreeze_fs() callback) needs. For ext4 this seems to be
sb->s_lock. I was actually wrong about ext4 online resizing via the resize
option causing possible deadlocks, because do_remount_sb() refuses to do
anything with the superblock while it is frozen... But still, if we ever
happen to start a transaction in ext4 while sb->s_lock is held, the
deadlock with the freezing code can happen, and that's just subtle and ugly
IMHO.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR