On Tue, Jan 03, 2012 at 02:39:42PM +0100, Jan Kara wrote:

> Thanks Stephen! Al, how shall we resolve this? You wrote you can provide
> a VFS helper like get_super() which will also guarantee that the fs is
> unfrozen. That could be used in quotactl_block() and fsync_bdev(). If you
> plan to do this for 3.3 then I can just remove the quota fix and let you
> do it.

I started digging in that area and I really don't like what I'm seeing.
sget() race fix from Aug 2010 (MS_BORN one) had not covered all cases.
The thing is, we can get hit with this:

1) mount(2) does sget(), etc. and fails very late in the game - with
->s_root already allocated. For some filesystems such failure exits are
possible.

2) something crawling through the superblock list finds our new sb
before we realize it's doomed. Tries to grab s_umount, gets blocked.

3) in the meanwhile *another* mount(2) does sget() that catches the same
sb and decides to pick it. ->s_active is grabbed, we get blocked on
attempt to get ->s_umount exclusive.

4) the original mount(2) gets to the failure point and does
deactivate_locked_super(). ->s_active is decremented, ->s_umount
unlocked. However, because of (3) ->s_active does not reach 0 yet.
Guy stuck in (2) gets to run. ->s_root is non-NULL here. And fs is not
in a good shape...

5) sget() from (3) gets to ->s_umount, notices that MS_BORN hadn't been
set and does deactivate_locked_super(). Now ->s_active is 0 and we get
around to shutting the sucker down. ->kill_sb() gets called, ->s_root
is dropped, etc. - the whole nine yards. Caller of sget() had been
saved from the race. However, whoever that had been in (2) and (4)
still got hit.

IOW, MS_BORN check is needed in the places that go through the superblock
list, grab ->s_umount and check ->s_root. That will close the hole for
good.

We also have a problem in get_active_super() caller; again, the missing
MS_BORN check (in freeze_super(), after getting ->s_umount).
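To make the sequence in (1)-(5) concrete, here is a toy user-space replay of it. This is only a sketch, not kernel code: struct toy_sb, TOY_MS_BORN, walker_check_old(), walker_check_born() and replay_failed_mount() are all made-up names, locking is not modelled at all, and the steps are simply run in the order the race would produce them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag bit standing in for the kernel's MS_BORN; the value is arbitrary. */
#define TOY_MS_BORN (1UL << 0)

/* Hypothetical, heavily reduced stand-in for struct super_block. */
struct toy_sb {
	void *s_root;          /* non-NULL once the root dentry exists */
	unsigned long s_flags; /* TOY_MS_BORN is set only after mount succeeded */
	int s_active;          /* active references */
};

/* What a list walker does today per the description: trust ->s_root alone. */
static bool walker_check_old(const struct toy_sb *sb)
{
	return sb->s_root != NULL;
}

/* The suggested check: ->s_root must be set *and* the sb must be born. */
static bool walker_check_born(const struct toy_sb *sb)
{
	return sb->s_root != NULL && (sb->s_flags & TOY_MS_BORN) != 0;
}

/*
 * Replay steps (1)-(5) sequentially and report whether the walker from
 * step (2) decided the doomed superblock was usable.
 */
static bool replay_failed_mount(bool (*check)(const struct toy_sb *))
{
	int fake_root = 0;
	bool walker_used_sb;
	/* (1) first mount: ->s_root allocated, mount will fail, never born */
	struct toy_sb sb = { .s_root = &fake_root, .s_flags = 0, .s_active = 1 };

	/* (3) second mount's sget() picks the same sb and grabs ->s_active */
	sb.s_active++;

	/* (4) first mount fails: deactivate, but the count does not hit 0 */
	sb.s_active--;
	assert(sb.s_active == 1);

	/* walker from (2) finally gets ->s_umount and looks at the sb */
	walker_used_sb = check(&sb);

	/* (5) second mount sees the sb was never born and drops it for real */
	sb.s_active--;
	if (sb.s_active == 0)
		sb.s_root = NULL; /* stands in for ->kill_sb() dropping the root */
	assert(sb.s_root == NULL);

	return walker_used_sb;
}
```

Calling replay_failed_mount(walker_check_old) returns true, i.e. the walker goes ahead with a superblock that is about to be torn down; replay_failed_mount(walker_check_born) returns false, so the walker skips it.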

I went through the ->mount() instances; most of them can't fail with
non-NULL ->s_root at all or, if they do, leave the superblock in basically
usable shape. However, some might be b0rken; among other things, ext4 and
minixfs *definitely* can leak root dentry on late failure exits. Still
doing RTFS...

Another fun question - can ->statfs() ever wait for fs to be thawed? If
so, we have another problem like the one spotted by Mikulas - in ustat(2).
And if not, we'd damn better document that requirement.