[RFC PATCH 0/3] VFS: Fix s_umount thaw/write deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've been working on some patches to fix the following problem:

File system freeze/thaw require the superblock's s_umount lock.
However, we write to the file system while holding the s_umount lock
in several places (e.g., writeback and quotas).  Any of these can
block on a frozen file system while holding s_umount, preventing any
later thaw from occurring, since thaw requires s_umount.  The solution
is to add a check, vfs_is_frozen(), to all code paths that write while
holding sb->s_umount and return without writing if it is true.

Attached is an audit of each use of ->s_umount and whether it could
result in a deadlock with freeze/thaw.  Some of this has gone into the
code comments.  Patches against some random post 3.0.0 pull to follow.

A patch to xfstests is coming.  Basically, we need to test for mmaped
writes concurrent with file system freeze.  Probably test 068 can be
extended. (If anyone is interested in doing this, it is relatively
simple work.)

Thanks, Jan Kara and Dave Chinner for their kind suggestions for fixes
and review of patches offline.

-VAL
Actual places s_umount is grabbed

fs/fs-writeback.c
 - grabbed by writeback_inodes_sb_if_idle()
   YES: fs is mounted, can be frozen, we may write an inode
   Called only by ext4 when space is getting low, return value ignored
   Solution: down_read_trylock() (also writeback_sb_inodes() checks
   for frozen)

 - grabbed by writeback_inodes_sb_nr_if_idle()
   YES: fs is mounted, can be frozen, we may write an inode
   Called only by btrfs when it is trying to free space, return value ignored
   Solution: down_read_trylock() (also writeback_sb_inodes() checks
   for frozen)

 - grabbed in wb_do_writeback() by grab_super_passive()
   YES: fs is mounted, can be frozen, we may write an inode
   Solution: writeback_sb_inodes() checks for frozen and returns
   Note: there was a livelock in wb_do_writeback() at grab_super_passive()

fs/namespace.c
 - grabbed by do_emergency_remount()
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: already checks for frozen in do_remount_sb()

 - grabbed by do_remount()
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: already checks for frozen in do_remount_sb()

 - grabbed by mount_bdev()
   NO: this is only called on first mount, fs not visible, can't be frozen

fs/super.c
 - grabbed by sync_supers()
   NO: We only write super block if s->s_dirt is set

 - grabbed by iterate_supers()/iterate_supers_type()
   YES: fs is mounted, can be frozen, we may write anything

   Solution: Fix current callers and document that
   iterate_supers[_type]() callers must check for frozen before
   writing

   Current iterate_supers() callers:
   fs/buffer.c: iterate_supers(do_thaw_one, NULL);
    - NO: emergency thaw - we do want to ignore the frozen
   fs/drop_caches.c: iterate_supers(drop_pagecache_sb, NULL);
    - NO: No dirty inodes, no writes should occur, but documented anyway
   fs/drop_caches.c: drop_slab()
    - MAYBE: Freeing inodes can dirty inodes, e.g. XFS shrinker.
    Documented that this kind of shrinking requires a frozen check first.

   fs/quota/quota.c: iterate_supers(quota_sync_one, &type);
    - YES: fs is mounted, can be frozen, depends on behavior of fs
   quota sync
    Solution: check for frozen in quota_sync_one()
 
  fs/sync.c: iterate_supers(sync_one_sb, &wait);
    - YES: fs is mounted, can be frozen, quota_sync and sync_fs called
    Nowhere is it written, quota ops can't write, sync can't write
    - Solution: Check for frozen in sync_one_sb()
   security/selinux/hooks.c: iterate_supers(delayed_superblock_init, NULL);
    - MAYBE: Let SELinux people figure this one out

 - grabbed by get_super() and kept until dropped by caller
   YES: fs is mounted, can be frozen, caller may write

   Current callers:
   fs/quota/quota.c: block-based quota ops:
   YES: fs is mounted, can be frozen, most quota ops may write
   Solution: check in quotactl()
   fs/block_dev.c: fsync_bdev()
   YES: fs is mounted, can be frozen, may write
   Solution: Existing check in sync_filesystem()
   fs/block_dev.c: freeze_bdev()
   NO: Caller immediately drops the s_umount ref via drop_super()
   fs/statfs.c: statfs() via user_get_super()
   YES: fs is mounted, can be frozen, ->statfs may write
   Solution: ->statfs() shouldn't write
 
 - grabbed during emergency remount
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: already checks for frozen in do_remount_sb()
 - grabbed during freeze_super() (included for completeness)
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: Limit writes to following: Sync out existing writes
   (sync_filesystem() call) with sb->s_frozen set to SB_FREEZE_WRITE
   (be sure write_inodes_sb() can complete in this situation), then
   set sb->s_frozen to SB_FREEZE_TRANS and write out superblock.
 - grabbed during thaw_super() (included for completeness)
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: Be sure the ->unfreeze_fs() op can complete with
   SB_FREEZE_TRANS set.

fs/sync.c
 - grabbed by syncfs(), single fs sync
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: Should call sync_one_sb(), actually, but with error
   return, Jan has patches touching this so make minimum change
   
 - grabbed eventually through sync_fileystem()
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: Document that all callers check for frozen-ness before calling this

File system-specific uses

fs/ubifs/budget.c
 - grabbed during sync to call writeback_inodes_sb()
   YES: fs is mounted, can be frozen, we may write the superblock
   Solution: check for frozen in writeback_inodes_sb()
fs/btrfs/volumes.c
 - grabbed during "seeding" of a bdev
   NO: we open the device with O_EXCL, no fs on it already
fs/cachefiles/interface.c
 - grabbed during sync, calls sync_filesystem()
   XXX not sure? Only equipped to deal with EIO
fs/nfs/super.c
 - grabbed during nfs_follow_remote_path()
   NO: just following path, no write will occur (hopefully)
fs/nilfs2/ioctl.c
 - held during change of fs mode, implemented as fs transaction
   XXX think this needs freeze awareness in general?
fs/reiserfs/procfs.c
 - dropped after get_super() call in /proc operation
   XXX don't know, need a reiser expert

[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux