On Mon, 2 Mar 2015, Eric Sandeen wrote: > Date: Mon, 02 Mar 2015 11:27:11 -0600 > From: Eric Sandeen <sandeen@xxxxxxxxxx> > To: Lukáš Czerner <lczerner@xxxxxxxxxx> > Cc: linux-ext4@xxxxxxxxxxxxxxx > Subject: Re: [PATCH] e2fsprogs: Limit number of reserved gdt blocks on small > fs > > On 3/2/15 11:08 AM, Lukáš Czerner wrote: > > On Mon, 2 Mar 2015, Eric Sandeen wrote: > > ... > > >> Some extra "making sure" in the above sentence. ;) > > > > Right, I'll fix that. > > > >> > >>> reserved_gdt_blocks is small enough. If the user > >>> + * specifies non-default values he can still run into this issue > >>> + * but we do not want to make that mistake by default. > >>> + * > >>> + * Magic numbers: > >>> + * - At block count smaller than 32768, journal will be 1024 block > >>> + * blocks long which would not be enough by default. > >>> + * - 64 reserved gtd blocks is maximum number we can fit into said > >> > >> gtd? (gdt? gdb?) > > > > Oh yes... > > > > I'll try to use it consistently. > > thanks :) > > >> > >>> + * journal by default (flex_bg_count * 4 + rsv_gdb) > >> > >> This is a little confusing; "flex_bg_count" isn't a thing, and it seems to > >> say that the reserved blocks depend on the number of reserved blocks? > > > > Does it ? It says that number of journal credits depend on the > > number of reserved gdt blocks, or maybe I am missing something. > > > > flex_bg_count is a number of groups in flex group, I'll try to come > > up with a better name. > > It just looks like a variable name; it's one that doesn't exist, so it's > not clear what you're talking about. > > >> > >> I'm curious, why does this matter only for 1k blocks, and not, say 2k blocks? > > > > That's because of the way we compute the number of reserved GDT > > blocks. The number of GDT blocks depends on the number of block > > groups, which depends on the number of blocks in the file system. > > And with 1k file system we have much more blocks. In fact we have: > > > > - twice as many blocks as with 2k file system > > - four times the block groups > > > > And we can also fit only half of the number of group descriptors > > into a single block which means that we need twice as many blocks > > per block group. > > > > Which means that with 1k file system we need 8 times as many gdt > > blocks as with 2k fs and 2048 times more gdt blocks as with 4k with > > the same setup. > > > >> > >>> + */ > >>> + if ((fs->blocksize == 1024) && (ext2fs_blocks_count(sb) < 32768) && > >>> + (rsv_gdb > 64)) > >>> + rsv_gdb = 64; > >>> + > >>> if (rsv_gdb > EXT2_ADDR_PER_BLOCK(sb)) > >>> rsv_gdb = EXT2_ADDR_PER_BLOCK(sb); > >>> #ifdef RES_GDT_DEBUG > >>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c > >>> index aeb852f..e9edd3b 100644 > >>> --- a/misc/mke2fs.c > >>> +++ b/misc/mke2fs.c > >>> @@ -3076,6 +3076,34 @@ int main (int argc, char *argv[]) > >>> } > >>> if (!quiet) > >>> printf("%s", _("done\n")); > >>> + > >>> + /* > >>> + * In online resize we require a huge number of journal > >>> + * credits mainly because of the reserved_gdt_blocks. The > >>> + * exact number of journal credits needed is: > >>> + * flex_gd->count * 4 + reserved_gdb > >>> + * > >>> + * Warn user if we will not have enough credits to resize > >>> + * the file system online. > >>> + */ > >>> + if (fs_param.s_feature_compat & EXT2_FEATURE_COMPAT_RESIZE_INODE) > >>> + { > >>> + unsigned int credits; > >>> + > >>> + /* > >>> + * This is the amount we're allowed to use for a > >>> + * single handle in a transaction > >>> + */ > >>> + journal_blocks /= 8; > >> > >> This seems a little dangerous; if someone decides to use journal_blocks > >> later in the function, it might be unexpectedly smaller. > >> > >>> + credits = (1 << fs_param.s_log_groups_per_flex) * 4 + > >>> + fs->super->s_reserved_gdt_blocks; > >>> + if (credits > journal_blocks) { > >> > >> so maybe just scale this by 8, i.e. "if (credits > journal_blocks/8)" > > > > Fair enough, I'll change that. > > > >> > >> And actually, I find myself wondering if this same calculation can be used > >> in calc_reserved_gdt_blocks, rather than using the magic numbers? > > > > Unfortunately it can not. At the time we're calculating number of gdt > > blocks we have no idea how big the journal will be. > > > > And at the time we already know how big the journal will be we > > already have blocks allocated for the resize_inode so we can not > > just simply change s_reserved_gdt_blocks here. > > Ugh. That sounds like a mess. > > But it feels like we need centralized functions or macros that express the > relationships between these values, rather than sprinkling "* 4" and "if 1024" > around. It seems to have the potential to get out of sync, but I've only > given it cursory thought... Yes, but that would require huge amount of rewrite I think. Mostly we're trying to think of the dependencies between various file system features and/or parameters. I think it's mostly in main() and PRS() functions however this "dependency" is entirely unexpected. I tried to think about a better way where to put it, but this is what I came with. However maybe we can determine the journal size by ext2fs_default_journal_size() and use the default journal size instead ? -Lukas > > -Eric >