Re: Files full of zeros with coreutils-8.11 and xfs (FIEMAP related?)

Pádraig Brady <P@xxxxxxxxxxxxxx> · Fri, 15 Apr 2011 00:29:46 +0100

On 14/04/11 23:59, Dave Chinner wrote:
> On Thu, Apr 14, 2011 at 10:50:10AM -0500, Eric Sandeen wrote:
>> On 4/14/11 9:59 AM, Pádraig Brady wrote:
>>> On 14/04/11 15:02, Markus Trippelsdorf wrote:
>>>>>> Hi Pádraig,
>>>>>>
>>>>>> here you go:
>>>>>> + filefrag -v unwritten.withdata
>>>>>> Filesystem type is: ef53
>>>>>> File size of unwritten.withdata is 5120 (2 blocks, blocksize 4096)
>>>>>>  ext logical physical expected length flags
>>>>>>    0       0   274432            2560 unwritten,eof
>>>>>> unwritten.withdata: 1 extent found
>>>>>>
>>>>>> Please notice that this also happens with ext4 on the same kernel. 
>>>>>> Btrfs is fine.
>>>>>
>>>> `filefrag -vs` fixes the issue on both xfs and ext4.
>>>
>>> So in summary, currently (on 2.6.39-rc3), the following
>>> will (usually?) report a single unwritten extent
>>> on both ext4 and xfs:
>>>
>>>   fallocate -l 10MiB -n k
>>>   dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
>>>   filefrag -v k # grep for an extent without unwritten || fail
>>
>> right, that's what I see too in testing.
>>
>> But would the coreutils install have done a preallocation of the destination file?
>>
>> Otherwise this looks like a different bug...
>>
>>> This particular issue has been discussed so far at:
>>> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8411
>>> Note it was stated there that ext4 had this
>>> fixed as of 2.6.39-rc1, so maybe there is something lurking?
>>
>> ext4 got a fix, but not xfs, I guess.  My poor brain can't remember, I think I started looking into it, but it's clearly still broken.
>>
>> Still, I don't know for sure what happened to Markus - did something preallocate, in his case?
> 
> Unwritten extent mapping behaves in an unexpected way due to
> buffered writeback not occurring immediately. Extent conversion
> doesn't occur until the data is on disk, and for buffered IO you
> need an fdatasync to ensure that has occurred.
> 
> That is: 
> 
> $ xfs_io -f -c "resvsp 0 10m" -c "pwrite 0 5120" -c "bmap -vp" /mnt/test/foo
> wrote 5120/5120 bytes at offset 0
> 5 KiB, 2 ops; 0.0000 sec (62.600 MiB/sec and 25641.0256 ops/sec)
> /mnt/test/foo:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>    0: [0..20479]:      268984..289463    0 (268984..289463) 20480 10000
> 
> Data has not been written yet, so it is still unwritten. The same
> test with a fsync shows:
> 
> $ sudo xfs_io -f -c "resvsp 0 10m" -c "pwrite 0 5120" -c fsync -c "bmap -vp" /mnt/test/foo
> wrote 5120/5120 bytes at offset 0
> 5 KiB, 2 ops; 0.0000 sec (87.193 MiB/sec and 35714.2857 ops/sec)
> /mnt/test/foo:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>    0: [0..15]:         268984..268999    0 (268984..268999)    16 00000
>    1: [16..20479]:     269000..289463    0 (269000..289463) 20464 10000
> 
> Everything is fine.
> 
> So this seems like an application error to me. If you are going to
> use fiemap to determine what ranges to copy, then you have to
> fdatasync the source file first to guarantee that preallocated
> extents have been converted to written state before mapping the
> file....

Well, IMHO there should be a difference between
knowing where you are going to write, and actually writing to disk.
I.e. one shouldn't need to write the whole way to the device
before getting a valid fiemap back.  If a particular file system
implementation needs to sync in order to return a valid fiemap,
then that sync should be implicit in the fiemap call.
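
For reference, the FIEMAP interface already lets the caller request the
sync as part of the call: FIEMAP_FLAG_SYNC asks the kernel to sync the
file before mapping it, which is what the `filefrag -vs` run quoted above
does. A minimal sketch of that, assuming Linux; map_extents() is a
hypothetical helper name, the constants are copied from <linux/fs.h> and
<linux/fiemap.h>, and it returns None where the file system does not
support FIEMAP:

```python
import fcntl
import os
import struct

FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x0001         # sync the file before mapping it
FIEMAP_EXTENT_UNWRITTEN = 0x0800  # allocated, but no data written yet

HEADER = struct.Struct("=QQIIII")     # struct fiemap, minus the extent array
EXTENT = struct.Struct("=QQQQQIIII")  # struct fiemap_extent


def map_extents(path, sync=True, max_extents=16):
    """Return [(logical, length, unwritten)] or None if FIEMAP fails."""
    buf = bytearray(HEADER.size + max_extents * EXTENT.size)
    flags = FIEMAP_FLAG_SYNC if sync else 0
    # fm_start=0, fm_length=~0 (whole file), fm_flags, fm_mapped_extents=0,
    # fm_extent_count=max_extents, fm_reserved=0
    HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, flags, 0, max_extents, 0)
    fd = os.open(path, os.O_RDONLY)
    try:
        # An application could call os.fdatasync(fd) here instead of
        # passing FIEMAP_FLAG_SYNC.
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
    except OSError:
        return None  # e.g. the file system has no FIEMAP support
    finally:
        os.close(fd)
    mapped = HEADER.unpack_from(buf, 0)[3]  # fm_mapped_extents
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        logical, length, fe_flags = fields[0], fields[2], fields[5]
        extents.append((logical, length,
                        bool(fe_flags & FIEMAP_EXTENT_UNWRITTEN)))
    return extents
```

Calling map_extents("k", sync=False) and map_extents("k", sync=True) on
the preallocated file from the reproduction above should be one way to
compare the two mappings, though I have not checked what each file system
reports in that case.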

cheers,
Pádraig.

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs