On Wed 31-07-13 14:15:43, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> When competing sync(2) calls walk the same filesystem, they need to
> walk the list of inodes on the superblock to find all the inodes
> that we need to wait for IO completion on. However, when multiple
> wait_sb_inodes() calls do this at the same time, they contend on
> the inode_sb_list_lock and the contention causes system wide
> slowdowns. In effect, concurrent sync(2) calls can take longer and
> burn more CPU than if they were serialised.
>
> Stop the worst of the contention by adding a per-sb mutex to wrap
> around wait_sb_inodes() so that we only execute one sync(2) IO
> completion walk per superblock at a time and hence avoid
> contention being triggered by concurrent sync(2) calls.

The patch looks OK. You can add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

								Honza

>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/fs-writeback.c  | 11 +++++++++++
>  fs/super.c         |  1 +
>  include/linux/fs.h |  2 ++
>  3 files changed, 14 insertions(+)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index ca66dc8..56272ec 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1207,6 +1207,15 @@ out_unlock_inode:
>  }
>  EXPORT_SYMBOL(__mark_inode_dirty);
>
> +/*
> + * The @s_sync_lock is used to serialise concurrent sync operations
> + * to avoid lock contention problems with concurrent wait_sb_inodes() calls.
> + * Concurrent callers will block on the s_sync_lock rather than doing
> + * contending walks. The queueing maintains sync(2) required behaviour: all
> + * the IO that has been issued up to the time this function is entered is
> + * guaranteed to be completed by the time we have gained the lock and waited
> + * for all IO that is in progress, regardless of the order callers are
> + * granted the lock.
> + */
>  static void wait_sb_inodes(struct super_block *sb)
>  {
>  	struct inode *inode, *old_inode = NULL;
> @@ -1217,6 +1226,7 @@ static void wait_sb_inodes(struct super_block *sb)
>  	 */
>  	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>
> +	mutex_lock(&sb->s_sync_lock);
>  	spin_lock(&sb->s_inode_list_lock);
>
>  	/*
> @@ -1258,6 +1268,7 @@ static void wait_sb_inodes(struct super_block *sb)
>  	}
>  	spin_unlock(&sb->s_inode_list_lock);
>  	iput(old_inode);
> +	mutex_unlock(&sb->s_sync_lock);
>  }
>
>  /**
> diff --git a/fs/super.c b/fs/super.c
> index d4d753e..7f98fd6 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -200,6 +200,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
>  	s->s_bdi = &default_backing_dev_info;
>  	INIT_HLIST_NODE(&s->s_instances);
>  	INIT_HLIST_BL_HEAD(&s->s_anon);
> +	mutex_init(&s->s_sync_lock);
>  	INIT_LIST_HEAD(&s->s_inodes);
>  	spin_lock_init(&s->s_inode_list_lock);
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 923b465..971e8be 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1321,6 +1321,8 @@ struct super_block {
>  	/* Being remounted read-only */
>  	int s_readonly_remount;
>
> +	struct mutex s_sync_lock;	/* sync serialisation lock */
> +
>  	/*
>  	 * Keep the lru lists last in the structure so they always sit on their
>  	 * own individual cachelines.
> --
> 1.8.3.2
>

--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR