Re: Question on fallocate/ftruncate sequence

On Fri, Aug 28, 2009 at 3:14 PM, Andreas Dilger<adilger@xxxxxxx> wrote:
> On Aug 28, 2009  14:44 -0700, Jiaying Zhang wrote:
>> On Fri, Aug 28, 2009 at 12:40 PM, Andreas Dilger<adilger@xxxxxxx> wrote:
>> > This isn't really correct, however, because i_blocks also counts
>> > non-data blocks (indirect/index, EA, etc.), so even for small
>> > files with ACLs i_blocks may always be larger than ia_size >> 9, and
>> > for ext2/3 at least this will ALWAYS be true for files > 48kB in size.
>>
>> I see. I guess we need to use a special flag then. Or are there any
>> other suggestions? I also have another question related to this
>> problem. Why are those fallocated blocks not marked as preallocated
>> blocks that will then be automatically freed in ext4_release_file?
>
> Because fallocate() means "persistent allocation on disk", not "in memory
> preallocation".  The "in memory" preallocation already happens in ext4,
> and it is released when the inode is cleaned up.

Right. Thanks for pointing this out!

RFC: here is a patch that Frank and I have been working on. It introduces
a new fs flag that marks a file as having blocks allocated beyond its EOF,
as discussed previously in this thread. The flag is set by fallocate with
FALLOC_FL_KEEP_SIZE and cleared by the next vmtruncate or by a fallocate
without KEEP_SIZE. A vmtruncate may be called unnecessarily if the file has
since been written beyond the allocated size, but I think that cost is
acceptable in exchange for correctness.
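
For reference, the sequence in question can be exercised from userspace
roughly as follows. This is only a sketch and not part of the patch: the
helper name keepsize_then_truncate() and the 1 MiB length are made up for
illustration, and it assumes a filesystem that supports fallocate with
FALLOC_FL_KEEP_SIZE.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch.  Creates `path`, preallocates
 * 1 MiB beyond EOF with FALLOC_FL_KEEP_SIZE, then truncates to the unchanged
 * i_size (0).  Reports st_blocks (512-byte units) after each step.
 * Returns 0 on success, -1 if any call fails.  The file is removed on exit.
 */
static int keepsize_then_truncate(const char *path,
				  long long *blocks_after_falloc,
				  long long *blocks_after_trunc)
{
	struct stat st;
	int ret = -1;
	int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);

	if (fd < 0)
		return -1;

	/* i_size stays 0; the allocated blocks land beyond EOF. */
	if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 1 << 20) != 0)
		goto out;
	if (fstat(fd, &st) != 0)
		goto out;
	*blocks_after_falloc = (long long)st.st_blocks;

	/* New size equals i_size, which is the case under discussion. */
	if (ftruncate(fd, 0) != 0)
		goto out;
	if (fstat(fd, &st) != 0)
		goto out;
	*blocks_after_trunc = (long long)st.st_blocks;
	ret = 0;
out:
	close(fd);
	unlink(path);
	return ret;
}
```

If the truncate is skipped because the size did not change, the second
block count stays equal to the first; with the flag in place it should
drop. The actual numbers depend on the filesystem and kernel, so none
are shown here.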

--- .pc/fallocate_keepsizse.patch/fs/attr.c	2009-08-28 15:38:46.000000000 -0700
+++ fs/attr.c	2009-08-28 17:01:04.000000000 -0700
@@ -68,7 +68,8 @@ int inode_setattr(struct inode * inode,
 	unsigned int ia_valid = attr->ia_valid;

 	if (ia_valid & ATTR_SIZE &&
-	    attr->ia_size != i_size_read(inode)) {
+	    (attr->ia_size != i_size_read(inode) ||
+	     (inode->i_flags & FS_KEEPSIZE_FL))) {
 		int error = vmtruncate(inode, attr->ia_size);
 		if (error)
 			return error;
--- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c	2009-08-28 15:37:45.000000000 -0700
+++ fs/ext4/extents.c	2009-08-28 17:27:27.000000000 -0700
@@ -3095,7 +3095,13 @@ static void ext4_falloc_update_inode(str
 			i_size_write(inode, new_size);
 		if (new_size > EXT4_I(inode)->i_disksize)
 			ext4_update_i_disksize(inode, new_size);
+		inode->i_flags &= ~FS_KEEPSIZE_FL;
 	} else {
+		/*
+		 * Mark that we allocate beyond EOF so the subsequent truncate
+		 * can proceed even if the new size is the same as i_size.
+		 */
+		inode->i_flags |= FS_KEEPSIZE_FL;
 	}
 }

--- .pc/fallocate_keepsizse.patch/fs/ext4/inode.c	2009-08-16 14:19:38.000000000 -0700
+++ fs/ext4/inode.c	2009-08-28 16:59:42.000000000 -0700
@@ -3973,6 +3973,8 @@ void ext4_truncate(struct inode *inode)
 	if (!ext4_can_truncate(inode))
 		return;

+	inode->i_flags &= ~FS_KEEPSIZE_FL;
+
 	if (inode->i_size == 0 && !test_opt(inode->i_sb, NO_AUTO_DA_ALLOC))
 		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;

--- .pc/fallocate_keepsizse.patch/include/linux/fs.h	2009-08-28 15:44:27.000000000 -0700
+++ include/linux/fs.h	2009-08-28 17:00:47.000000000 -0700
@@ -343,6 +343,7 @@ struct inodes_stat_t {
 #define FS_TOPDIR_FL			0x00020000 /* Top of directory hierarchies*/
 #define FS_EXTENT_FL			0x00080000 /* Extents */
 #define FS_DIRECTIO_FL			0x00100000 /* Use direct i/o */
+#define FS_KEEPSIZE_FL			0x00200000 /* Blocks allocated beyond EOF */
 #define FS_RESERVED_FL			0x80000000 /* reserved for ext2 lib */

 #define FS_FL_USER_VISIBLE		0x0003DFFF /* User visible flags */

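To make the intended flag life cycle easier to review, here is a small
userspace model of it. It is a sketch only: struct model_inode and the
model_*() functions are hypothetical stand-ins for the inode, fallocate,
ext4_truncate and the size check in inode_setattr, and the `patched`
argument selects between the old and the new condition.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of this is kernel code. */
struct model_inode {
	long long size;		/* stands in for i_size */
	long long allocated;	/* bytes backed by blocks */
	bool keepsize;		/* stands in for FS_KEEPSIZE_FL */
};

/* Models the truncate path: drop blocks past new_size, clear the flag. */
static void model_truncate(struct model_inode *ino, long long new_size)
{
	ino->size = new_size;
	if (ino->allocated > new_size)
		ino->allocated = new_size;
	ino->keepsize = false;
}

/* Models fallocate up to `end`; keep_size mirrors FALLOC_FL_KEEP_SIZE. */
static void model_fallocate(struct model_inode *ino, long long end,
			    bool keep_size)
{
	if (end > ino->allocated)
		ino->allocated = end;
	if (keep_size) {
		ino->keepsize = true;	/* blocks may now sit beyond EOF */
	} else {
		if (end > ino->size)
			ino->size = end;
		ino->keepsize = false;
	}
}

/*
 * Models the ATTR_SIZE check.  Without the patch, a request for the
 * current size is a no-op; with it, the flag forces the truncate.
 * Returns true if the truncate ran.
 */
static bool model_setattr_size(struct model_inode *ino, long long new_size,
			       bool patched)
{
	if (new_size != ino->size || (patched && ino->keepsize)) {
		model_truncate(ino, new_size);
		return true;
	}
	return false;
}
```

In this model, a keep-size fallocate of 1 MiB on an empty file followed by
model_setattr_size(&ino, 0, false) leaves `allocated` untouched, while the
same call with patched == true brings it back to 0 and clears the flag.
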
Jiaying

>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>