On Wed, Apr 13, 2011 at 4:45 PM, Andreas Dilger <adilger@xxxxxxxxx> wrote: > On 2011-04-12, at 7:27 AM, Amir Goldstein wrote: >> I realized another issue regarding exclude bitmap compatibility. >> the exiting EXT4_IOC_GROUP_ADD ioctl doesn't pass a field for the location >> of exclude_bitmap block, so we need to either allocate exclude_bitmap in kernel >> or define a new ioctl, which passes the exclude_bitmap to kernel. >> >> If we are going to go with the latter solution, we may want to add >> support for flex_bg layout for the new ioctl, so following is my proposal >> for EXT4_IOC_FLEX_GROUP_ADD: >> >> 1. As far as online resize and mkfs are concerned, we always allocate all >> group descriptors of a flex bg at the same time. >> >> 2. If there is not enough space for all flex_bg metadata in the last group, >> the last group will be dropped. >> >> 3. The new flex group input will assume all bitmaps of the same type are >> consecutive, so only the address of the first bitmap is needed as input. >> >> struct ext4_new_flex_group_input { >> __u32 group; /* Group number for this data */ > > This field will cause a misalignment/padding in the structure and will cause > problems with 32-bit binaries on 64-bit kernels. > > Should we add a "__u32 flags" field to this structure? I prefer flags/features > instead of versions... This will avoid the need to continually adding new > ioctls in the future. Agreed. I just copy pasted the current ext4_new_group_input with its existing flaws. > >> __u64 block_bitmap; /* Absolute block number of first >> block bitmap */ >> __u64 exclude_bitmap; /* Absolute block number of first >> exclude bitmap */ >> __u64 inode_bitmap; /* Absolute block number of first >> inode bitmap */ >> __u64 inode_table; /* Absolute block number of first >> inode table start */ >> __u32 blocks_count; /* Total number of blocks in this flex group */ >> __u16 reserved_blocks; /* Number of reserved blocks in this >> flex group */ >> __u16 flex_size; /* Number of groups in the flex group */ >> }; > > > Since the kernel only supports a single s_log_groups_per_flex value in the superblock, we should use the superblock value and change "flex_size" to be just a count of the number of groups that should be added. That means resize2fs should align its ioctls to multiples of the superblock s_log_groups_per_flex. > Agreed, let's call it group_count. In most cases, resize2fs would use the new ioctl to add *exactly* 1 flex group, with all its bitmaps and descriptors, just as the current ioctl always adds 1 group, even if not all group is made available at that time. resize2fs will need to use group_count < groups_per_flex to align existing fs to flex group. in case existing fs has 1 group and online resize extends it to 16 groups, the bitmaps for group 1-15 will all be placed in group 1. does that make sense? or would it be better to extend the fs 1 group at a time until flex group boundary in that case? > >> 4. ext4_group_extend() should be the same except we need to allow extending >> within the last flex bg, but not necessarily the last block group. >> >> >> To look at this from a different angle, if you imagine that the flex >> group is just a big group, whose bitmaps and group descriptor are flex_size >> times bigger and where ext4_new_flex_group_input encodes the info of the big >> descriptor, then this design is identical to the current implementation of >> online resize. >> >> What I will do, if you agree to this design, is use the new ioctl, >> with flex_size = 1 to pass the exclude_bitmap in online resize and enforce >> flex_size == 1 in kernel. > > If you are going to be modifying the kernel to add support for this ioctl, > why not properly add in support for flex_size > 1? because I don't have time to do it (test it) now :-( and I want to post the e2fsprogs exclude bitmap patch as soon as possible. > It should only involve > the kernel looping over the groups as they are added. This doesn't require > that resize2fs has added support for it yet, but forcing flex_size == 1 in > the kernel means that userspace will not know whether the kernel is doing > the right thing or not. > if I implement group_count > 1 in the kernel, I will need to implement it in e2fsprogs, so I can test it properly. hopefully, there will be no official kernel which enforces group_count == 1. > In order to handle future resize in the case of a partially-added flex group, > it would be desirable for the unused blocks (bitmaps, inode table) are reserved > so that they are not allocated by the block allocator. > Yes, that would be desirable, but then again, we will anyway need to handle resize of fs which was formatted/resized before my patches. Anayway, I hope I will implement proper flex_bg resize for 2.6.40, I'm just not sure I can commit to it. >> Then later we can teach resize2fs and the kernel to extend flex groups properly >> using the new ioctl. >> >> What do you think? >> Can you think of another way I can support exclude_bitmap and online resize, >> without the need for a new ioctl? >> >> Amir. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in >> the body of a message to majordomo@xxxxxxxxxxxxxxx >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > Cheers, Andreas > > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html