I've been working on some patches to fix the following problem: File system freeze/thaw require the superblock's s_umount lock. However, we write to the file system while holding the s_umount lock in several places (e.g., writeback and quotas). Any of these can block on a frozen file system while holding s_umount, preventing any later thaw from occurring, since thaw requires s_umount. The solution is to add a check, vfs_is_frozen(), to all code paths that write while holding sb->s_umount and return without writing if it is true. Attached is an audit of each use of ->s_umount and whether it could result in a deadlock with freeze/thaw. Some of this has gone into the code comments. Patches against some random post 3.0.0 pull to follow. A patch to xfstests is coming. Basically, we need to test for mmaped writes concurrent with file system freeze. Probably test 068 can be extended. (If anyone is interested in doing this, it is relatively simple work.) Thanks, Jan Kara and Dave Chinner for their kind suggestions for fixes and review of patches offline. -VAL
Actual places s_umount is grabbed fs/fs-writeback.c - grabbed by writeback_inodes_sb_if_idle() YES: fs is mounted, can be frozen, we may write an inode Called only by ext4 when space is getting low, return value ignored Solution: down_read_trylock() (also writeback_sb_inodes() checks for frozen) - grabbed by writeback_inodes_sb_nr_if_idle() YES: fs is mounted, can be frozen, we may write an inode Called only by btrfs when it is trying to free space, return value ignored Solution: down_read_trylock() (also writeback_sb_inodes() checks for frozen) - grabbed in wb_do_writeback() by grab_super_passive() YES: fs is mounted, can be frozen, we may write an inode Solution: writeback_sb_inodes() checks for frozen and returns Note: there was a livelock in wb_do_writeback() at grab_super_passive() fs/namespace.c - grabbed by do_emergency_remount() YES: fs is mounted, can be frozen, we may write the superblock Solution: already checks for frozen in do_remount_sb() - grabbed by do_remount() YES: fs is mounted, can be frozen, we may write the superblock Solution: already checks for frozen in do_remount_sb() - grabbed by mount_bdev() NO: this is only called on first mount, fs not visible, can't be frozen fs/super.c - grabbed by sync_supers() NO: We only write super block if s->s_dirt is set - grabbed by iterate_supers()/iterate_supers_type() YES: fs is mounted, can be frozen, we may write anything Solution: Fix current callers and document that iterate_supers[_type]() callers must check for frozen before writing Current iterate_supers() callers: fs/buffer.c: iterate_supers(do_thaw_one, NULL); - NO: emergency thaw - we do want to ignore the frozen fs/drop_caches.c: iterate_supers(drop_pagecache_sb, NULL); - NO: No dirty inodes, no writes should occur, but documented anyway fs/drop_caches.c: drop_slab() - MAYBE: Freeing inodes can dirty inodes, e.g. XFS shrinker. Documented that this kind of shrinking requires a frozen check first. fs/quota/quota.c: iterate_supers(quota_sync_one, &type); - YES: fs is mounted, can be frozen, depends on behavior of fs quota sync Solution: check for frozen in quota_sync_one() fs/sync.c: iterate_supers(sync_one_sb, &wait); - YES: fs is mounted, can be frozen, quota_sync and sync_fs called Nowhere is it written, quota ops can't write, sync can't write - Solution: Check for frozen in sync_one_sb() security/selinux/hooks.c: iterate_supers(delayed_superblock_init, NULL); - MAYBE: Let SELinux people figure this one out - grabbed by get_super() and kept until dropped by caller YES: fs is mounted, can be frozen, caller may write Current callers: fs/quota/quota.c: block-based quota ops: YES: fs is mounted, can be frozen, most quota ops may write Solution: check in quotactl() fs/block_dev.c: fsync_bdev() YES: fs is mounted, can be frozen, may write Solution: Existing check in sync_filesystem() fs/block_dev.c: freeze_bdev() NO: Caller immediately drops the s_umount ref via drop_super() fs/statfs.c: statfs() via user_get_super() YES: fs is mounted, can be frozen, ->statfs may write Solution: ->statfs() shouldn't write - grabbed during emergency remount YES: fs is mounted, can be frozen, we may write the superblock Solution: already checks for frozen in do_remount_sb() - grabbed during freeze_super() (included for completeness) YES: fs is mounted, can be frozen, we may write the superblock Solution: Limit writes to following: Sync out existing writes (sync_filesystem() call) with sb->s_frozen set to SB_FREEZE_WRITE (be sure write_inodes_sb() can complete in this situation), then set sb->s_frozen to SB_FREEZE_TRANS and write out superblock. - grabbed during thaw_super() (included for completeness) YES: fs is mounted, can be frozen, we may write the superblock Solution: Be sure the ->unfreeze_fs() op can complete with SB_FREEZE_TRANS set. fs/sync.c - grabbed by syncfs(), single fs sync YES: fs is mounted, can be frozen, we may write the superblock Solution: Should call sync_one_sb(), actually, but with error return, Jan has patches touching this so make minimum change - grabbed eventually through sync_fileystem() YES: fs is mounted, can be frozen, we may write the superblock Solution: Document that all callers check for frozen-ness before calling this File system-specific uses fs/ubifs/budget.c - grabbed during sync to call writeback_inodes_sb() YES: fs is mounted, can be frozen, we may write the superblock Solution: check for frozen in writeback_inodes_sb() fs/btrfs/volumes.c - grabbed during "seeding" of a bdev NO: we open the device with O_EXCL, no fs on it already fs/cachefiles/interface.c - grabbed during sync, calls sync_filesystem() XXX not sure? Only equipped to deal with EIO fs/nfs/super.c - grabbed during nfs_follow_remote_path() NO: just following path, no write will occur (hopefully) fs/nilfs2/ioctl.c - held during change of fs mode, implemented as fs transaction XXX think this needs freeze awareness in general? fs/reiserfs/procfs.c - dropped after get_super() call in /proc operation XXX don't know, need a reiser expert