Thanks a lot for the comments and suggestions! On Sat, Aug 29, 2009 at 7:52 PM, Theodore Tso<tytso@xxxxxxx> wrote: > On Fri, Aug 28, 2009 at 05:40:54PM -0700, Jiaying Zhang wrote: >> --- .pc/fallocate_keepsizse.patch/fs/attr.c 2009-08-28 15:38:46.000000000 -0700 >> +++ fs/attr.c 2009-08-28 17:01:04.000000000 -0700 >> @@ -68,7 +68,8 @@ int inode_setattr(struct inode * inode, >> unsigned int ia_valid = attr->ia_valid; >> >> if (ia_valid & ATTR_SIZE && >> - (attr->ia_size != i_size_read(inode)) { >> + (attr->ia_size != i_size_read(inode) || >> + (inode->i_flags & FS_KEEPSIZE_FL))) { >> int error = vmtruncate(inode, attr->ia_size); >> if (error) >> return error; > > Instead of doing this in the generic code, it really should be done in > ext4_setattr. Technically speaking, we don't actually need the > FS_KEEPSIZE_FL to solve this problem; instead we can simply have the > ext4 code look in the extent tree to see if there are any blocks > mapped beyond the logical block: > > i_size_read(inode) >> inode->i_sb->s_blocksize_bits Is it relatively cheap to scan the extent tree? Will this add the overhead to truncate? > > Having a flag as Andreas suggested does help with the issue of e2fsck > noticing whether or not i_size is incorrect (and should be fixed) or > the file has been extended. So keeping having the flag is an OK thing > to do, but we need to be careful about a particularly subtle > overloading problem. The flags FS_*_FL as defined in > include/linux/fs.h are technically only for in-memory use. The ext4 > on-disk format flags is EXT4_*_FL, and defined in ext4.h. > > The flags were originially defined for use in ext2/3/4, but later on > other filesystems adopted those flags so that e2fsprogs's chattr and > lsattr programs could be used for their filesystems as well. It just > so happens that for ext2/3/4 the on-disk encoding of those flags in > the in-memory encoding of those flags in i_flags are the same, but > that means that the flags need to be defined in both places to avoid > assignment overlaps. We also need to be clear whether the flags are > internal flags for ext4's use only, or flags meant for use by all > filesystems. This is why the testing for FS_KEEPSIZE_FL in fs/attr is > particularly bad, if the flag are going to be set in fs/ext4/extents.c. > > It's better to define the flag as EXT4_KEEPSIZE_FL, and to use it as > EXT4_KEEPSIZE_FL, but make a note of that bitfield position as being > reserved in include/linux/fs.h. Here is the modified patch based on your suggestions. I stick with the KEEPSIZE_FL approach that I think can allow us to handle the special truncation accordingly during fsck. Other file systems can also re-use this flag when they want to support fallocate with KEEP_SIZE. As you suggested, I moved the EXT4_KEEPSIZE_FL checking to ext4_setattr that now calls vmtruncate if the KEEPSIZE flag is set in the i_flag. Please let me know what you think about this proposed patch. Thanks a lot! Jiaying --- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c 2009-08-31 12:08:10.000000000 -0700 +++ fs/ext4/extents.c 2009-08-31 12:12:16.000000000 -0700 @@ -3095,7 +3095,13 @@ static void ext4_falloc_update_inode(str i_size_write(inode, new_size); if (new_size > EXT4_I(inode)->i_disksize) ext4_update_i_disksize(inode, new_size); + inode->i_flags &= ~EXT4_KEEPSIZE_FL; } else { + /* + * Mark that we allocate beyond EOF so the subsequent truncate + * can proceed even if the new size is the same as i_size. + */ + inode->i_flags |= EXT4_KEEPSIZE_FL; } } --- .pc/fallocate_keepsizse.patch/fs/ext4/inode.c 2009-08-31 12:08:10.000000000 -0700 +++ fs/ext4/inode.c 2009-08-31 12:12:16.000000000 -0700 @@ -3973,6 +3973,8 @@ void ext4_truncate(struct inode *inode) if (!ext4_can_truncate(inode)) return; + inode->i_flags &= ~EXT4_KEEPSIZE_FL; + if (inode->i_size == 0 && !test_opt(inode->i_sb, NO_AUTO_DA_ALLOC)) ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE; @@ -4807,7 +4809,9 @@ int ext4_setattr(struct dentry *dentry, } if (S_ISREG(inode->i_mode) && - attr->ia_valid & ATTR_SIZE && attr->ia_size < inode->i_size) { + attr->ia_valid & ATTR_SIZE && + (attr->ia_size < inode->i_size || + (inode->i_flags & EXT4_KEEPSIZE_FL))) { handle_t *handle; handle = ext4_journal_start(inode, 3); @@ -4838,6 +4842,11 @@ int ext4_setattr(struct dentry *dentry, goto err_out; } } + if ((inode->i_flags & EXT4_KEEPSIZE_FL)) { + rc = vmtruncate(inode, attr->ia_size); + if (rc) + goto err_out; + } } rc = inode_setattr(inode, attr); --- .pc/fallocate_keepsizse.patch/include/linux/fs.h 2009-08-31 12:08:10.000000000 -0700 +++ include/linux/fs.h 2009-08-31 12:12:16.000000000 -0700 @@ -343,6 +343,7 @@ struct inodes_stat_t { #define FS_TOPDIR_FL 0x00020000 /* Top of directory hierarchies*/ #define FS_EXTENT_FL 0x00080000 /* Extents */ #define FS_DIRECTIO_FL 0x00100000 /* Use direct i/o */ +#define FS_KEEPSIZE_FL 0x00200000 /* Blocks allocated beyond EOF */ #define FS_RESERVED_FL 0x80000000 /* reserved for ext2 lib */ #define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */ --- .pc/fallocate_keepsizse.patch/fs/ext4/ext4.h 2009-08-31 12:08:10.000000000 -0700 +++ fs/ext4/ext4.h 2009-08-31 12:12:16.000000000 -0700 @@ -235,6 +235,7 @@ struct flex_groups { #define EXT4_HUGE_FILE_FL 0x00040000 /* Set to each huge file */ #define EXT4_EXTENTS_FL 0x00080000 /* Inode uses extents */ #define EXT4_EXT_MIGRATE 0x00100000 /* Inode is migrating */ +#define EXT4_KEEPSIZE_FL 0x00200000 /* Blocks allocated beyond EOF (bit reserved in fs.h) */ #define EXT4_RESERVED_FL 0x80000000 /* reserved for ext4 lib */ #define EXT4_FL_USER_VISIBLE 0x000BDFFF /* User visible flags */ > > - Ted > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html