> > Hello, Hi Jan, > > > + > > +int file_write_unfreeze(struct inode *inode) > > +{ > > + struct super_block *sb = inode->i_sb; > > + > > + if (!S_ISREG(inode->i_mode)) > > + return -EINVAL; > > + > > + spin_lock(&inode->i_lock); > > + > > + if (!(inode->i_state & I_WRITE_FREEZED)) { > > + spin_unlock(&inode->i_lock); > > + return -EINVAL; > > + } > > + > > + inode->i_state &= ~I_WRITE_FREEZED; > > + smp_wmb(); > > + wake_up(&sb->s_writers.wait_unfrozen); > > + spin_unlock(&inode->i_lock); > > + return 0; > > +} > > +EXPORT_SYMBOL(file_write_unfreeze); > So I was looking at the implementation and I have a few comments: > 1) The trick with freezing superblock looks nice but I'm somewhat worried > that if we wanted to heavily use per-inode freezing to defrag the whole > filesystem it may be too slow to freeze the whole fs, mark one inode as > frozen and then unfreeze the fs. But I guess we'll see that once have some > reasonably working implementation. Dmitry has given a good idea to avoid multiple freeze fs and unfreeze fs calls. ioctl(sb,FIFREEZE) while (f = pop(files_list)) ioctl(f,FS_IOC_FWFREEZE) ioctl(sb,FITHAW) In file_write_freeze, we could first check if the fs is already frozen, if yes than we can directly set inode write freeze state after taking relevant lock to prevent fs_thaw while the inode state is being set. > > 2) The tests you are currently doing are racy. If > things happen as: > CPU1 CPU2 > inode_start_write() > file_write_freeze() > sb_start_pagefault() > Do modifications. > > Then you have a CPU modifying a file while file_write_freeze() has > succeeded so it should be frozen. > > If you swap inode_start_write() with sb_start_pagefault() the above race > doesn't happen but userspace program has to be really careful not to hit a > deadlock. E.g. if you tried to freeze two inodes the following could happen: > CPU1 CPU2 > file_write_freeze(inode1) > fault on inode1: > sb_start_pagefault() > inode_start_write() -> blocks > file_write_freeze(inode2) > blocks in freeze_super() > > So I don't think this is a good scheme for inode freezing... To solve this race, we can fold inode_start_write with sb_start_write and use similar appraoch of __sb_start_write. How about the below scheme ? void inode_start_write(struct inode *inode) { struct super_block *sb = inode->i_sb; retry: if (unlikely(inode->i_state & I_WRITE_FREEZED)) { DEFINE_WAIT(wait); prepare_to_wait(&sb->s_writers.wait_unfrozen, &wait, TASK_UNINTERRUPTIBLE); schedule(); finish_wait(&sb->s_writers.wait_unfrozen, &wait); goto retry; } sb_start_write(sb); /* check if file_write_freeze race with us */ if (unlikely(inode->i_state & I_WRITE_FREEZED) { sb_end_write(sb); goto retry; } } Thanks for your review! > > Honza > -- > Jan Kara <jack@xxxxxxx> > SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html