> > 在 2018年1月15日,下午8:27,Amir Goldstein <amir73il@xxxxxxxxx> 写道: > > On Mon, Jan 15, 2018 at 2:19 PM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote: >>> >>> 在 2018年1月15日,下午7:30,Amir Goldstein <amir73il@xxxxxxxxx> 写道: >>> >>> On Mon, Jan 15, 2018 at 9:33 AM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote: >>>> Currently syncfs(2) call on overlayfs just simply call sync_filesystem() >>>> on upper_sb to synchronize whole dirty inodes in upper filesystem >>>> regardless of the overlay ownership of the inode. In the use case of >>>> container, when multiple containers using the same underlying upper >>>> filesystem, it has some shortcomings as below. >>>> >>>> (1) Performance >>>> Synchronization is probably heavy because it actually syncs unnecessary >>>> inodes for target overlayfs. >>>> >>>> (2) Interference >>>> Unplanned synchronization will probably impact IO performance of >>>> irrelative container processes on the other overlayfs. >>> >>> s/irrelative/unrelated >>> >>>> >>>> This patch iterates upper inode list in overlayfs to only sync target >>>> dirty inodes and wait for completion. By doing this, It is able to reduce >>>> cost of synchronization and will not seriously impact IO performance of >>>> irrelative processes. In special case, when having very few dirty inodes >>> >>> unrelated processes >>> >>>> and a lot of clean upper inodes in overlayfs, then iteration may waste >>>> time so that synchronization is slower than before, but we think the >>>> potential for performance improvement far surpass the potential for >>>> performance regression in most cases. >>>> >>>> Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxx> >>> >>> Beyond Miklos's comment on removing inodes from list, a few more >>> comments and nits below. >>> >>>> --- > [...] >>>> +/** >>>> + * ovl_sync_filesystem >>>> + * @sb: The overlayfs super block >>>> + * >>>> + * Sync underlying dirty inodes in upper filesystem and wait for completion. >>>> + */ >>>> +static int ovl_sync_filesystem(struct super_block *sb) >>>> +{ >>>> + struct ovl_fs *ofs = sb->s_fs_info; >>>> + struct super_block *upper_sb = ofs->upper_mnt->mnt_sb; >>>> + >>>> + ovl_sync_inodes(sb); >>>> + ovl_wait_inodes(sb); >>>> + >>>> + if (upper_sb->s_op->sync_fs) >>>> + upper_sb->s_op->sync_fs(upper_sb, 1); >>>> + >>>> + return sync_blockdev(upper_sb->s_bdev); >>>> +} >>>> + >>>> /* Sync real dirty inodes in upper filesystem (if it exists) */ >>>> static int ovl_sync_fs(struct super_block *sb, int wait) >>>> { >>>> @@ -266,17 +422,19 @@ static int ovl_sync_fs(struct super_block *sb, int wait) >>>> * If this is a sync(2) call or an emergency sync, all the super blocks >>>> * will be iterated, including upper_sb, so no need to do anything. >>>> * >>>> - * If this is a syncfs(2) call, then we do need to call >>>> - * sync_filesystem() on upper_sb, but enough if we do it when being >>>> + * If this is a syncfs(2) call, then we do need to call sync_inode() >>>> + * on underlying dirty upper_inode, but enough if we do it when being >>>> * called with wait == 1. >>>> */ >>>> if (!wait) >>>> return 0; >>>> >>>> upper_sb = ofs->upper_mnt->mnt_sb; >>>> + if (sb_rdonly(upper_sb)) >>>> + return 0; >>>> >>>> down_read(&upper_sb->s_umount); >>>> - ret = sync_filesystem(upper_sb); >>>> + ret = ovl_sync_filesystem(sb); >>>> up_read(&upper_sb->s_umount); >>> >>> The boundaries between ovl_sync_fs() and ovl_sync_filesystem() are now very >>> strange.. taking upper_sb s_umount lock outside and passing in sb etc. >>> I am not sure that ovl_sync/wait_inodes() should be called under upper_sb >>> s_umount lock at all. Remember we are already holding sb s_umount lock. >>> I suggest, now that ovl_sync/wait_inodes() have been factored out, to squash >>> ovl_sync_filesystem() back into ovl_sync_fs() and hold the s_umount lock >>> only around s_op->sync_fs() and sync_blockdev(). >> >> From the comment in sync_filesystem, s_umount is used for against the filesystem going >> from r/o to r/w or vice versa. so I think it is necessary and should move readonly >> check inside ovl_sync_filesystem() to get the protection. >> > > Right, but the interface between the 2 functions looks weird, so > please either get rid of > ovl_sync_filesystem() entirely and move its code into ovl_sync_fs() > under the s_umount > lock where needed or move the s_umount lock and sb_rdonly() test into > ovl_sync_filesystem(sb), which leaves ovl_sync_fs() pretty slim. OKAY, I take the latter option. Thanks, Chengguang. -- To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html