On Wed, Sep 2, 2009 at 10:20 PM, Jiaying Zhang<jiayingz@xxxxxxxxxx> wrote: > On Wed, Sep 2, 2009 at 1:41 AM, Andreas Dilger<adilger@xxxxxxx> wrote: >> On Aug 31, 2009 16:33 -0700, Jiaying Zhang wrote: >>> > EXT4_KEEPSIZE_FL should only be cleared if there were writes to >>> > the end of the fallocated space. In that regard, I think the name >>> > of this flag should be changed to something like "EXT4_EOFBLOCKS_FL" >>> > to indicate that blocks are allocated beyond the end of file (i_size). >>> >>> Thanks for catching this! I changed the patch to only clear the flag >>> when the new_size is larger than i_size and changed the flag name >>> as you suggested. It would be nice if we only clear the flag when we >>> write beyond the fallocated space, but this seems hard to detect >>> because we no longer have the allocated size once that keepsize >>> fallocate call returns. >> >> The problem is that if e2fsck depends on the EXT4_EOFBLOCKS_FL set >> for fallocate-beyond-EOF then it is worse to clear it than to leave >> it set. At worst, leaving the flag set results in too many truncates >> on the file. Clearing the flag when not correct may result in user >> visible data corruption if the file size is extended... >> >>> Here is the new patch: >>> >>> --- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c 2009-08-31 >>> 12:08:10.000000000 -0700 >>> +++ fs/ext4/extents.c 2009-08-31 15:51:13.000000000 -0700 >>> @@ -3091,11 +3091,19 @@ static void ext4_falloc_update_inode(str >>> * the file size. >>> */ >>> if (!(mode & FALLOC_FL_KEEP_SIZE)) { >>> + if (new_size > i_size_read(inode)) { >>> i_size_write(inode, new_size); >>> + inode->i_flags &= ~EXT4_EOFBLOCKS_FL; >> >> This again isn't quite correct, since the EOFBLOCKS_FL shouldn't >> be cleared unless new_size is beyond the allocated size. The >> allocation code itself might be a better place to clear this, >> since it knows whether there were new blocks being added beyond >> the current max allocated block. > > We were thinking to clear this flag when we need to allocate new > blocks, but I was not sure how to get the current max allocated > block -- that is mostly because I just started working on the ext4 > code. After digging into the ext4 allocation code today, I think we > can put the check&clear in ext4_ext_get_blocks: > > @@ -2968,6 +2968,14 @@ int ext4_ext_get_blocks(handle_t *handle > newex.ee_len = cpu_to_le16(ar.len); > if (create == EXT4_CREATE_UNINITIALIZED_EXT) /* Mark uninitialized */ > ext4_ext_mark_uninitialized(&newex); > + > + if (unlikely(inode->i_flags & EXT4_EOFBLOCKS_FL)) { > + BUG_ON(!eh->eh_entries); > + last_ex = EXT_LAST_EXTENT(eh); > + if (iblock + max_blocks > le32_to_cpu(last_ex->ee_block) > + + ext4_ext_get_actual_len(last_ex)) > + inode->i_flags &= ~EXT4_EOFBLOCKS_FL; > + } > err = ext4_ext_insert_extent(handle, inode, path, &newex); > if (err) { > /* free data blocks we just allocated */ > > Again, I just started looking at this part of code, so please let me know > if I am in the right direction. > > Another thing I am not sure is whether we can allocate a non-data block, > like extended attributes, beyond the current max block without changing > the i_size. In that case, clearing the EOFBLOCKS flag will be wrong. > >>> #define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */ >> >> It probably isn't a bad idea to make this flag user-visible, since it >> would allow scanning for files that have excess space reserved (e.g. >> if the filesystem is getting full). Making it user-settable (i.e. >> clearable) would essentially mean truncating the file to i_size without >> updating the timestamps so that the reserved space is discarded. I >> don't think there is any value in allowing a user to turn this flag on >> for a file. > > So to make it user-settable, we need to add the handling in ext4_ioctl > that calls vmtruncate when the flag to be cleared. But how can we get > the right size to truncate in that case? Can we just set that to the > max initialized block shift with block size? But that may also truncate > the blocks reserved without the KEEP_SIZE flag. Never mind, that is a stupid question. We can just truncate to the current i_size. Jiaying > > Jiaying > >> >> Cheers, Andreas >> -- >> Andreas Dilger >> Sr. Staff Engineer, Lustre Group >> Sun Microsystems of Canada, Inc. >> >> > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html