On Wed, 9 Feb 2011 12:05:11 +0200, "Amir G." <amir73il@xxxxxxxxxxxxxxxxxxxxx> wrote: > Hi Aneesh, > > As you are signed off on most of the recent alloc_sem related code changes, > can you please comment on the patch below, which tries to avoid taking > the read lock most of the times on a 4K block fs. > > Can anyone tell what performance impact (if any) will be caused by avoiding > the read lock on most allocations? group spin lock will still be taken, but for > much shorter periods of time (cycles). > > Any ideas how this patch can be properly tested? A quick check says the changes are correct. But i am not sure whether we want to conditionalize these locks unless they appear as highly contented locks in a profile. > > Thanks, > Amir. > > grp->alloc_sem is used to synchronize buddy cache users with buddy cache init > of other groups that use the same buddy cache page and with adding blocks to > group on online resize. > > When blocks_per_page <= 2, each group has it's own private buddy cache page > so taking the read lock for every allocation is futile and can be avoided for > every group, but the last one. > > The write lock is taken in ext4_mb_init_group() and in ext4_add_groupblocks() > to synchronize the buddy cache init of a group on first time allocation after > mount and after extending the last group. > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxxxxx> > --- > fs/ext4/mballoc.c | 19 +++++++++++++++---- > 1 files changed, 15 insertions(+), 4 deletions(-) > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index 1b3256b..22a5251 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -1160,7 +1160,15 @@ ext4_mb_load_buddy(struct super_block *sb, > ext4_group_t group, > e4b->bd_group = group; > e4b->bd_buddy_page = NULL; > e4b->bd_bitmap_page = NULL; > - e4b->alloc_semp = &grp->alloc_sem; > + /* > + * We only need to take the read lock if other groups share the buddy > + * page with this group or if blocks may be added to this (last) group > + * by ext4_group_extend(). > + */ > + if (blocks_per_page > 2 || group == sbi->s_groups_count - 1) If we can say groups_per_page > 1 that would make it more clear. > + e4b->alloc_semp = &grp->alloc_sem; > + else > + e4b->alloc_semp = NULL; > > /* Take the read lock on the group alloc > * sem. This would make sure a parallel > @@ -1169,7 +1177,8 @@ ext4_mb_load_buddy(struct super_block *sb, > ext4_group_t group, > * till we are done with allocation > */ > repeat_load_buddy: > - down_read(e4b->alloc_semp); > + if (e4b->alloc_semp) > + down_read(e4b->alloc_semp); > > if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) { > /* we need to check for group need init flag > @@ -1177,7 +1186,8 @@ repeat_load_buddy: > * that new blocks didn't get added to the group > * when we are loading the buddy cache > */ > - up_read(e4b->alloc_semp); > + if (e4b->alloc_semp) > + up_read(e4b->alloc_semp); > /* > * we need full data about the group > * to make a good selection > @@ -1277,7 +1287,8 @@ err: > e4b->bd_bitmap = NULL; > > /* Done with the buddy cache */ > - up_read(e4b->alloc_semp); > + if (e4b->alloc_semp) > + up_read(e4b->alloc_semp); > return ret; > } > -aneesh -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html