Re: [PATCH] ext4: use vmtruncate() instead of ext4_truncate() in ext4_setattr()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 17, 2011 at 10:19:05PM -0500, Eric Sandeen wrote:
> On 5/17/11 5:59 PM, Jiaying Zhang wrote:
> > There is a bug in commit c8d46e41 "ext4: Add flag to files with blocks
> > intentionally past EOF" that if we fallocate a file with FALLOC_FL_KEEP_SIZE
> > flag and then ftruncate the file to a size larger than the file's i_size,
> > any allocated but unwritten blocks will be freed but the file size is set
> > to the size that ftruncate specifies.
> > 
> > Here is a simple test to reproduce the problem:
> >   1. fallocate a 12k size file with KEEP_SIZE flag
> >   2. write the first 4k
> >   3. ftruncate the file to 8k
> > Then 'ls -l' shows that the i_size of the file becomes 8k but debugfs
> > shows the file has only the first written block left.
> 
> To be honest I'm not 100% certain what the fiesystem -should- do in this case.
> 
> If I go through that same sequence on xfs, I get 4k written / 8k unwritten:
> 
> # xfs_bmap -vp testfile
> testfile:
>  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET              TOTAL FLAGS
>    0: [0..7]:          2648750760..2648750767  3 (356066400..356066407)     8 00000
>    1: [8..23]:         2648750768..2648750783  3 (356066408..356066423)    16 10000

Ok, so that's the case for a _truncate up_ from 4k to 8k:

$ rm /mnt/test/foo
$ xfs_io -f -c "resvsp 0 12k" -c stat -c "bmap -vp" -c "pwrite 0 4k" -c "fsync" -c "bmap -vp" -c "t 8k" -c "bmap -vp" -c stat /mnt/test/foo
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 0
stat.blocks = 24
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 1
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..23]:         9712..9735        0 (9712..9735)        24 10000
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0000 sec (156 MiB/sec and 40000.0000 ops/sec)
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9712..9719        0 (9712..9719)         8 00000
   1: [8..23]:         9720..9735        0 (9720..9735)        16 10000
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9712..9719        0 (9712..9719)         8 00000
   1: [8..23]:         9720..9735        0 (9720..9735)        16 10000
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 8192
stat.blocks = 24
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 2
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136

But you get a different result on truncate down:

$rm /mnt/test/foo
$ xfs_io -f -c "truncate 12k" -c "resvsp 0 12k" -c stat -c "bmap -vp" -c "pwrite 0 4k" -c "fsync" -c "bmap -vp" -c "t 8k" -c "bmap -vp" -c stat /mnt/test/foo
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 12288
stat.blocks = 24
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 1
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..23]:         9584..9607        0 (9584..9607)        24 10000
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0000 sec (217.014 MiB/sec and 55555.5556 ops/sec)
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9584..9591        0 (9584..9591)         8 00000
   1: [8..23]:         9592..9607        0 (9592..9607)        16 10000
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9584..9591        0 (9584..9591)         8 00000
   1: [8..15]:         9592..9599        0 (9592..9599)         8 10000
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 8192
stat.blocks = 16
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 2
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136

IOWs, on XFS a truncate up does not change the preallocation at all,
while a truncate down will _always_ remove preallocation beyond the
new EOF.  It's always had this behaviour w.r.t. to truncate(2) and
preallocation beyond EOF.

> I think this is a different result from ext4, either with or without your patch.
> 
> On ext4 I get size 8k, but only the first 4k mapped, as you say.
> 
> I don't recall when truncate is supposed to free fallocated blocks, and from what point?

It's entirely up to the filesystem how it treats blocks beyond EOF
during truncation. XFS frees them on truncate down, because it is
much safer to just truncate away everything beyond the new EOF than
to leave written extents beyond EOF as potential landmines.

Indeed, that's why calling vmtruncate() as a bad fix. If you have:


	       NUUUUUUUUUUWWWWWWWWWOUUUUUUUUU
       ....----+----------+--------+--------+
               A	  B        C        D

Where	A = new EOF (N)
	A->B = unwritten (U)
	B->C = written (W)
	C = old EOF (O)
	C->D = unwritten (U)

Then just calling vmtruncate() will leave the blocks in the range
B->C as written blocks. Hence then doing an extending truncate back
out to D will expose stale data rather than zeros in the range
B->C....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux