On Jul 21, 2008  08:34 -0400, Theodore Ts'o wrote:
> Well, as I said originally, we have four choices, only two of which are
> tenable:
>
> 1) Don't change i_size and leave e2fsck confused about whether i_size
> is confused or not; the next time e2fsck runs it can either fix it and
> change i_size, confusing applications that depend on i_size, or not
> fix it and in the case of a corrupted i_size, leave valid data
> inaccessible or do the hack to which Andreas reacted, "Yuck", and
> which Aneesh quoted and I assume agrees.  (i.e., checking the data
> blocks to see if they are non-zero, and electing to risk confusing
> the application in the case where they are non-zero).  This is the
> current case.
>
> 2) Change i_size and always confuse applications that depend on i_size
> carrying some semantic meaning.
>
> 3) Don't aggressively zero-out (as it presents us with these two
> untenable options) and try to split the extent instead.  If the block
> allocation fails, return ENOSPC.
>
> 4) #3, except if the block allocation fails, try to steal a block that
> had been previously preallocated for some other logical block in that
> inode.

5) Add a flag to the inode which means "blocks beyond i_size" (set if
fallocate() is called with "KEEP_SIZE" and the allocation is actually
beyond i_size and not just filling a hole) so that e2fsck won't "fix"
the size, while still allowing the extent to be uninitialized.  The flag
is cleared (by the kernel and/or e2fsck) if the size is extended to the
last block.

To avoid consuming our precious inode flags, we might consider re-using
EXT3_DIRSYNC_FL or EXT3_TOPDIR_FL for this purpose, since they
definitely only have meaning for directories.  I guess the question is
whether we would need this for directories, but I don't think so, as we
could always just add empty directory blocks (at the expense of having
to scan them later).
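As a rough sketch of how the flag in 5) might behave, here is a small userspace model of the set/clear rules and of the check e2fsck could make. All of the names (SKETCH_EOFBLOCKS_FL, struct sketch_inode, the sketch_* helpers) and the fixed 4096-byte block size are made up for illustration; this is not actual ext4 or e2fsprogs code, and a real implementation would reuse one of the existing flag bits as discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit and block size, chosen only for this sketch. */
#define SKETCH_EOFBLOCKS_FL 0x1u
#define SKETCH_BLOCK_SIZE   4096u

/* Minimal stand-in for the inode fields that matter here. */
struct sketch_inode {
    uint64_t i_size;      /* logical file size in bytes */
    uint64_t last_block;  /* highest allocated logical block (0-based) */
    bool     has_blocks;  /* false while nothing is allocated */
    uint32_t i_flags;
};

/* Number of blocks needed to hold 'size' bytes. */
static uint64_t sketch_size_in_blocks(uint64_t size)
{
    return (size + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

/*
 * Model of fallocate() with KEEP_SIZE over blocks
 * [start_blk, start_blk + nblks).  i_size is left alone; the flag is
 * set only when the range reaches past i_size, not for a hole fill.
 */
static void sketch_prealloc_keep_size(struct sketch_inode *ino,
                                      uint64_t start_blk, uint64_t nblks)
{
    assert(nblks > 0);
    uint64_t end_blk = start_blk + nblks - 1;

    if (!ino->has_blocks || end_blk > ino->last_block) {
        ino->last_block = end_blk;
        ino->has_blocks = true;
    }
    if (end_blk >= sketch_size_in_blocks(ino->i_size))
        ino->i_flags |= SKETCH_EOFBLOCKS_FL;
}

/* Model of extending i_size; clears the flag once i_size reaches the
 * last allocated block. */
static void sketch_set_size(struct sketch_inode *ino, uint64_t new_size)
{
    ino->i_size = new_size;
    if (ino->has_blocks &&
        sketch_size_in_blocks(new_size) > ino->last_block)
        ino->i_flags &= ~SKETCH_EOFBLOCKS_FL;
}

/* e2fsck-style check: blocks past i_size are suspicious only when the
 * flag does not account for them. */
static bool sketch_fsck_size_suspect(const struct sketch_inode *ino)
{
    bool blocks_past_eof = ino->has_blocks &&
        ino->last_block >= sketch_size_in_blocks(ino->i_size);

    return blocks_past_eof && !(ino->i_flags & SKETCH_EOFBLOCKS_FL);
}
```

With this model, a one-block file that gets blocks 1-4 preallocated with KEEP_SIZE keeps its i_size, gains the flag, and is not reported by the check; extending the size to cover block 4 clears the flag again.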

> The one other thing I would note is that at least for non-root users,
> the reserved blocks will help save us most of the time, except for
> when users explicitly set the reserved blocks down to zero.

Would the index block be allocated from the reserved space though?  This
is also a good idea, but I'm not sure if that is what happens.  I guess
the "allocate index block" code path needs to check for
"(uid == s_reserved_uid || is_metadata)"?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.