On Thu, 2008-05-01 at 22:44 +0530, Aneesh Kumar K.V wrote:
> On Wed, Apr 30, 2008 at 03:41:10PM +0200, Valerie Clement wrote:
> > mballoc: fix mb_normalize_request algorithm for 1KB block size filesystems
> >
> > From: Valerie Clement <valerie.clement@xxxxxxxx>
> >
> > In case of inode preallocation, the number of blocks to allocate depends
> > on the file size and it is calculated in ext4_mb_normalize_group_request().
> > Each group in the filesystem is then checked to find one that can be used
> > for allocation; this is done in ext4_mb_good_group().
> >
> > When a file bigger than 4MB is created, the requested number of blocks to
> > preallocate, calculated by ext4_mb_normalize_group_request is 4096.
> > However for a filesystem with 1KB block size, the maximum size of the
> > block buddies used by the multiblock allocator is 2048, so none of
> > groups in the filesystem satisfies the search criteria in
> > ext4_mb_good_group(). Scanning all the filesystem groups impacts
> > performance.
>
> s/ext4_mb_normalize_group_request/ext4_mb_normalize_request/
>
>
> That's true the max order is block_size_bits + 1
> Can you update the commit message with the above information ?
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
>

Ok, Updated patch queue with the comment changes.
Also fixed a small checkpatch warning:-)

Mingming

> >
> > The following numbers show that:
> > - on an ext4 FS with 1KB block size mounted with nodelalloc option:
> > # dd if=/dev/zero of=/mnt/test/foo bs=8k count=1k conv=fsync
> > 1024+0 records in
> > 1024+0 records out
> > 8388608 bytes (8.4 MB) copied, 35.5091 seconds, 236 kB/s
> >
> > - on an ext4 FS with 1KB block size mounted with nodelalloc and nomballoc
> > options:
> > # dd if=/dev/zero of=/mnt/test/foo bs=8k count=1k conv=fsync
> > 1024+0 records in
> > 1024+0 records out
> > 8388608 bytes (8.4 MB) copied, 0.233754 seconds, 35.9 MB/s
> >
> > In the two cases, dd is done after creating the FS with -b1024 option,
> > mounting the FS with the options specified before and flushing all caches
> > using echo 3 > /proc/sys/vm/drop_caches.
> > The partition size is 70GB.
> > I did the same test on a 1TB partition, it took several minutes to write
> > 8MB!
> >
> > This patch modifies the algorithm in ext4_mb_normalize_group_request to
> > calculate the number of blocks to allocate by taking into account the
> > maximum size of free blocks chunks handled by the multiblock allocator.
> >
> > It has also been tested for filesystems with 2KB and 4KB block sizes to
> > ensure that those cases don't regress.
> >
> > Signed-off-by: Valerie Clement <valerie.clement@xxxxxxxx>
> >
> > ---
> >
> >  mballoc.c |   19 +++++++++----------
> >  1 file changed, 9 insertions(+), 10 deletions(-)
> >
> > Index: linux-2.6.25/fs/ext4/mballoc.c
> > ===================================================================
> > --- linux-2.6.25.orig/fs/ext4/mballoc.c	2008-04-25 16:19:32.000000000 +0200
> > +++ linux-2.6.25/fs/ext4/mballoc.c	2008-04-25 16:49:34.000000000 +0200
> > @@ -2905,12 +2905,11 @@ ext4_mb_normalize_request(struct ext4_al
> >  	if (size < i_size_read(ac->ac_inode))
> >  		size = i_size_read(ac->ac_inode);
> >
> > -	/* max available blocks in a free group */
> > -	max = EXT4_BLOCKS_PER_GROUP(ac->ac_sb) - 1 - 1 -
> > -				EXT4_SB(ac->ac_sb)->s_itb_per_group;
> > +	/* max size of free chunks */
> > +	max = 2 << bsbits;
> >
> > -#define NRL_CHECK_SIZE(req, size, max,bits) \
> > -		(req <= (size) || max <= ((size) >> bits))
> > +#define NRL_CHECK_SIZE(req, size, max, chunk_size) \
> > +		(req <= (size) || max <= (chunk_size))
> >
> >  	/* first, try to predict filesize */
> >  	/* XXX: should this table be tunable? */
> > @@ -2929,16 +2928,16 @@ ext4_mb_normalize_request(struct ext4_al
> >  		size = 512 * 1024;
> >  	} else if (size <= 1024 * 1024) {
> >  		size = 1024 * 1024;
> > -	} else if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, bsbits)) {
> > +	} else if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, 2 * 1024)) {
> >  		start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
> > -				(20 - bsbits)) << 20;
> > -		size = 1024 * 1024;
> > -	} else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, bsbits)) {
> > +				(21 - bsbits)) << 21;
> > +		size = 2* 1024 * 1024;
> > +	} else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, 4 * 1024)) {
> >  		start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
> >  				(22 - bsbits)) << 22;
> >  		size = 4 * 1024 * 1024;
> >  	} else if (NRL_CHECK_SIZE(ac->ac_o_ex.fe_len,
> > -				(8<<20)>>bsbits, max, bsbits)) {
> > +				(8<<20)>>bsbits, max, 8 * 1024)) {
> >  		start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
> >  				(23 - bsbits)) << 23;
> >  		size = 8 * 1024 * 1024;
> >
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
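
For readers following the arithmetic in the quoted commit message, here is a minimal userspace sketch, not kernel code. It assumes the maximum buddy order is block_size_bits + 1, as stated in the review, and models only the two patched table branches for files larger than 1MB and at most 8MB. The helper names max_chunk_blocks() and prealloc_blocks() are hypothetical and do not exist in mballoc.c; start_off handling and the remaining table entries are left out.

```c
#include <assert.h>

/* Same form as the macro in the patched hunk, with arguments parenthesized. */
#define NRL_CHECK_SIZE(req, size, max, chunk_size) \
	((req) <= (size) || (max) <= (chunk_size))

/*
 * Hypothetical helper: largest free chunk, in blocks, tracked by the
 * buddy allocator, assuming max order = block_size_bits + 1.
 */
static unsigned long max_chunk_blocks(int bsbits)
{
	assert(bsbits >= 10 && bsbits <= 12);	/* 1KB, 2KB, 4KB blocks only */
	return 2UL << bsbits;
}

/*
 * Hypothetical helper: normalized preallocation size, in blocks, for a
 * file of `size` bytes. Returns 0 for sizes this sketch does not model.
 */
static unsigned long prealloc_blocks(unsigned long size, int bsbits)
{
	unsigned long max = max_chunk_blocks(bsbits);

	if (size <= 1024 * 1024 || size > 8 * 1024 * 1024)
		return 0;	/* outside the branches modelled here */

	if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, 2 * 1024))
		size = 2 * 1024 * 1024;
	else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, 4 * 1024))
		size = 4 * 1024 * 1024;

	return size >> bsbits;	/* bytes to blocks */
}
```

Under these assumptions, an 8MB file on a 1KB block size filesystem normalizes to 2048 blocks, which equals max_chunk_blocks(10), whereas the commit message reports that the pre-patch table requested 4096 blocks. On a 4KB block size filesystem the same file still normalizes to 4MB, that is 1024 blocks.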