On 07/01/2011 06:09 PM, Lukas Czerner wrote: > On Thu, 30 Jun 2011, Tao Ma wrote: > >> From: Tao Ma <boyu.mt@xxxxxxxxxx> >> >> In ext4, when FITRIM is called every time, we iterate all the >> groups and do trim one by one. It is a bit time wasting if the >> group has been trimmed and there is no change since the last >> trim. >> >> So this patch adds a new flag in ext4_group_info->bb_state to >> indicate that the group has been trimmed, and it will be cleared >> if some blocks is freed(in release_blocks_on_commit). Another >> trim_minlen is added in ext4_sb_info to record the last minlen >> we use to trim the volume, so that if the caller provide a small >> one, we will go on the trim regardless of the bb_state. >> >> A simple test with my intel x25m ssd: >> df -h shows: >> /dev/sdb1 40G 21G 17G 56% /mnt/ext4 >> Block size: 4096 >> >> run the FITRIM with the following parameter: >> range.start = 0; >> range.len = UINT64_MAX; >> range.minlen = 1048576; >> >> without the patch: >> [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a >> real 0m5.505s >> user 0m0.000s >> sys 0m1.224s >> [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a >> real 0m5.359s >> user 0m0.000s >> sys 0m1.178s >> [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a >> real 0m5.228s >> user 0m0.000s >> sys 0m1.151s >> >> with the patch: >> [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a >> real 0m5.625s >> user 0m0.000s >> sys 0m1.269s >> [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a >> real 0m0.002s >> user 0m0.000s >> sys 0m0.001s >> [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a >> real 0m0.002s >> user 0m0.000s >> sys 0m0.001s >> >> A big improvement for the 2nd and 3rd run. >> >> Even after I delete some big image files, it is still much >> faster than iterating the whole disk. >> >> [root@boyu-tm test]# time ./ftrim /mnt/ext4/a >> real 0m1.217s >> user 0m0.000s >> sys 0m0.196s > > Thanks for the patch Tao, I have couple of comments bellow. > >> >> Reviewed-by: Andreas Dilger <adilger.kernel@xxxxxxxxx> >> Signed-off-by: Tao Ma <boyu.mt@xxxxxxxxxx> >> --- >> fs/ext4/ext4.h | 13 ++++++++++++- >> fs/ext4/mballoc.c | 20 ++++++++++++++++++++ >> fs/ext4/super.c | 2 ++ >> 3 files changed, 34 insertions(+), 1 deletions(-) >> >> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h >> index 1921392..5878a22 100644 >> --- a/fs/ext4/ext4.h >> +++ b/fs/ext4/ext4.h >> @@ -1214,6 +1214,9 @@ struct ext4_sb_info { >> >> /* Kernel thread for multiple mount protection */ >> struct task_struct *s_mmp_tsk; >> + >> + /* record the last minlen when FITRIM is called. */ >> + atomic_t s_last_trim_minblks; >> }; >> >> static inline struct ext4_sb_info *EXT4_SB(struct super_block *sb) >> @@ -2067,11 +2070,19 @@ struct ext4_group_info { >> * 5 free 8-block regions. */ >> }; >> >> -#define EXT4_GROUP_INFO_NEED_INIT_BIT 0 >> +#define EXT4_GROUP_INFO_NEED_INIT_BIT 0 >> +#define EXT4_GROUP_INFO_WAS_TRIMMED_BIT 1 >> >> #define EXT4_MB_GRP_NEED_INIT(grp) \ >> (test_bit(EXT4_GROUP_INFO_NEED_INIT_BIT, &((grp)->bb_state))) >> >> +#define EXT4_MB_GRP_WAS_TRIMMED(grp) \ >> + (test_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state))) >> +#define EXT4_MB_GRP_SET_TRIMMED(grp) \ >> + (set_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state))) >> +#define EXT4_MB_GRP_CLEAR_TRIMMED(grp) \ >> + (clear_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state))) >> + >> #define EXT4_MAX_CONTENTION 8 >> #define EXT4_CONTENTION_THRESHOLD 2 >> >> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c >> index a822f2a..afdd869 100644 >> --- a/fs/ext4/mballoc.c >> +++ b/fs/ext4/mballoc.c >> @@ -2628,6 +2628,15 @@ static void release_blocks_on_commit(journal_t *journal, transaction_t *txn) >> rb_erase(&entry->node, &(db->bb_free_root)); >> mb_free_blocks(NULL, &e4b, entry->start_blk, entry->count); >> >> + /* >> + * Clear the trimmed flag for the group so that the next >> + * ext4_trim_fs can trim it. >> + * If the volume is mounted with -o discard, online discard >> + * is supported and the free blocks will be trimmed online. >> + */ >> + if (!test_opt(sb, DISCARD)) >> + EXT4_MB_GRP_CLEAR_TRIMMED(db); > > If the online discard failed we should probably clear the flag as well > right ? However I am not sure how helpful it would be since it'll most > likely fail again. Maybe we do not need to bother with that case at all, > what do you think ? yes, in this case, we don't have a minor chance to trim successfully later and it will increase the time of the next FITRIM ioctl. So I guess we can leave it as-is. > >> + >> if (!db->bb_free_root.rb_node) { >> /* No more items in the per group rb tree >> * balance refcounts from ext4_mb_free_metadata() >> @@ -4840,6 +4849,10 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group, >> bitmap = e4b.bd_bitmap; >> >> ext4_lock_group(sb, group); >> + if (EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) && >> + minblocks >= atomic_read(&EXT4_SB(sb)->s_last_trim_minblks)) >> + goto out; >> + >> start = (e4b.bd_info->bb_first_free > start) ? >> e4b.bd_info->bb_first_free : start; >> >> @@ -4871,6 +4884,10 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group, >> if ((e4b.bd_info->bb_free - free_count) < minblocks) >> break; >> } >> + >> + if (!ret) >> + EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info); >> +out: >> ext4_unlock_group(sb, group); >> ext4_mb_unload_buddy(&e4b); >> >> @@ -4960,6 +4977,9 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range) >> } >> range->len = trimmed * sb->s_blocksize; >> >> + if (!ret) >> + atomic_set(&EXT4_SB(sb)->s_last_trim_minblks, minlen); >> + >> out: >> return ret; >> } >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index 9ea71aa..080e9f9 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -3086,6 +3086,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >> sbi->s_sectors_written_start = >> part_stat_read(sb->s_bdev->bd_part, sectors[1]); >> >> + atomic_set(&sbi->s_last_trim_minblks, 0); >> + > > I am not sure that this is needed since sbi is allocated with kzalloc, > so it should be zero already. yes, maybe. I can remove it if you like. Thanks Tao > >> /* Cleanup superblock name */ >> for (cp = sb->s_id; (cp = strchr(cp, '/'));) >> *cp = '!'; >> > > Thanks! > -Lukas > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html