On Fri, Mar 09, 2012 at 03:22:53PM +0100, Jan Kara wrote: > On Fri 09-03-12 10:20:41, Dave Chinner wrote: > > On Mon, Mar 05, 2012 at 05:01:09PM +0100, Jan Kara wrote: > > > Generic code now blocks all writers from standard write paths. So we block all > > > writers coming from ioctl and replace blocking of transactions on frozen > > > filesystem with a debugging check. As a bonus, we get a protection of ioctl > > > against racing remount read-only. We also convert xfs_file_aio_write() to a > > > non-racy freeze protection. > > .... > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c > > > index 329b06a..6468a2a 100644 > > > --- a/fs/xfs/xfs_trans.c > > > +++ b/fs/xfs/xfs_trans.c > > > @@ -577,7 +577,6 @@ xfs_trans_alloc( > > > xfs_mount_t *mp, > > > uint type) > > > { > > > - xfs_wait_for_freeze(mp, SB_FREEZE_TRANS); > > > return _xfs_trans_alloc(mp, type, KM_SLEEP); > > > } > > > > So what is there to stop internal XFS threads from starting > > transactions when the filesystem is frozen? Previously this > > SB_FREEZE_TRANS would guarantee even internal fucntions would get > > stopped, but now there's nothing? > I've checked XFS now and I didn't find any work that would be done on a > clean filesystem. I found: > _xfs_mru_cache_reap() - doesn't seem to do any IO Causes iput() on inodes, which if dropping the last reference to the inode can cause them to enter reclaim and hence have transactions run on them to either truncate blocks beyonds EOF or to do the second phase of an unlink(). So it will definitely break the freeze model. > xfs_end_io(), xfs_buf_iodone_work() - shouldn't be a problem either since only > reads can happen on a frozen filesystem > xfs_reclaim_worker() - everything is clean so no IO should happen It will do work as other things push inodes into reclaim. e.g. filestreams inode expiry. And it will still run to reclaim clean inodes... > xfs_flush_worker() - everything is clean so no IO should happen > xfs_sync_worker() - again the same if I understand the code right xfs_sync_worker() will always trigger a log force, so if there is anything that has dirtied the journal, it will trigger IO. We have no protection against the journal being dirtied anymore, so no guarantees can be given here. Basically, your patchset creates a "shell" around the outside of the filesystem that catches all the external modifications that can occur through the VFS and ioctls. The "shell" is now the only layer of defense because the patchset removes the layer of protection that prevents internal modifications from occurring. For example, the XFS patch added a bunch of protection to ioctls on XFS that only modify file metadata and so never were protected against data freezes - they were all implicitly protected by the inner transaction subsystem freeze. All of the above cases were protected by the inner layer of defense, which is now gone. Not to mention the shrinkers. The inode cache shrinkers can cause inodes to enter reclaim which can trigger EOF truncation just like _xfs_mru_cache_reap(). The dquot cache shrinker can also issue IO, and dquots will be dirtied by EOF truncation. Same with buffers, and the buffer cache shrinker. I'm sure other filesystems have just as complex or even more complex internal paths that trigger dirtying and IO that the "freeze shell" model does not prevent. I don't think auditing is good enough - the shell model is, IMO, too easy to break and too hard to test for breakage... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html