On Wed 08-02-17 17:27:58, Vivek Trivedi wrote: > It has been observed that apps may block in sys_sync for long time if there > is parallel mount request for large size storage block device in a loaded > environment. > > For example, sys_sync is reported to be hunged when a large size disk > (e.g. 4TB/8TB disks) is connected. sys_mount may take time for reading large > amount of metedata - free space accounting by reading bitmap during mount. > The larger the volume, the larger the size of the bitmaps read during mount. > > During mount operation s_umount lock is taken by sys_mount. The lock is not > released till the mount is completed. The sync() process keeps on waiting till > s_umount is relased by sys_mount. > > Scenario > Process1 Process2 > sys_sync() sys_mount() > iterate_supers do_mount() > ... vfs_kern_mount() > mount_fs() (Filesystem_mount) > ... mount_bdev() > sget() > alloc_super() > down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); => LOCK HELD by MOUNT > ... > down_read(&sb->s_umount); => WAITING FOR LOCK ... > . ... > . fill_super() -> time TAKING (as per filesystem) > . up_write(&sb->s_umount); => LOCK RELEASE by mount process > . > . STUCKED TILL MOUNT is completed > > Since, the superblock gets added to the list during the mount path alloc_super, > so the 'sb' is visible in the s_list. But the behaviour of waiting to sync() a > filesystem which is not active yet, seems ambigous here. > > To avoid this issue, may be we should consider about having to check only > the ACTIVE filesystem for doing operations from the superblock list. > 'sb' is valid and referencable as long as part of the s_list and MS_ACTIVE is > set after successful mount and cleared during the umount path from > generic_shutdown_super. > > --- > Fixed a typo and updated the condition in iterate_supers. Changelog of a patch belongs under the diffstat so that it does not get into the final commit log. > Signed-off-by: Vivek Trivedi <t.vivek@xxxxxxxxxxx> > Reviewed-by: Amit Sahrawat <a.sahrawat@xxxxxxxxxxx> > --- > fs/super.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/super.c b/fs/super.c > index b8b6a08..01711a4 100644 > --- a/fs/super.c > +++ b/fs/super.c > @@ -587,6 +587,10 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg) > list_for_each_entry(sb, &super_blocks, s_list) { > if (hlist_unhashed(&sb->s_instances)) > continue; > + > + if (!(sb->s_flags & MS_ACTIVE)) > + continue; > + So I have two comments to this: 1) Using MS_BORN would be more appropriate here - and consistent with the fact that we do check it in iterate_supers(), just under s_umount. 2) iterate_supers_type() should be updated as well to keep them consistent. Otherwise the patch looks OK to me. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR