Hi Dave.

Thanks for the explanation!!

> > By my interpretation, I'd say that if the fallocate fails with -ENOSPC, no
> > allocation should be done at all, and the file size should not be changed,
> > once we can't guarantee the writes will not fail due lack of disk space.
>
> preallocation can fail half way through, but users can't see that so
> there's no real requirement to remove partial allocations on
> failure.
>

I think this is partially true. Users can't see any change in the file
size, but in some situations, if the pre-allocation is beyond EOF, the
blocks might actually be pre-allocated to the file, making that space
unavailable, at least until some other operation triggers
xfs_free_eofblocks().

I am still reading the code to understand why in some situations we end
up with blocks allocated beyond EOF, and in others we end up with no
pre-allocated blocks at all. I suppose it depends on how many extents
(and their size) are being pre-allocated. So if the first unwritten
extent already fails, we end up with no blocks, but if a few succeed,
then we will end up with these blocks pre-allocated.

While I don't think this is a filesystem problem, the fallocate man page
is not quite clear about what users of fallocate() should do in case
they get an ENOSPC error, so maybe something like:

"User is supposed to clean up partial allocations in case of ENOSPC"

might make it a bit clearer.

> Indeed, rollback of partial allocations makes error handling
> ridiculously complex - think about an allocation across a range of a
> file that has alternating holes and unwritten extents. Half way
> through we get ENOSPC, and now we have to go punch out the extents
> we've already preallocated. Unless we record every single extent we
> allocate in a preallocation, there's no way we can return the file
> to it's previous state. We could be allocating thousands of extents
> in a single syscall.
> Hence rollback on failure is pretty much doomed
> on all existing filesystems, so it's not required.
>

Makes sense.

> > However, what I see is a different behavior for different filesystems, for
> > instance, if the file already has some blocks allocated, Ext4 will leave the
> > file with a partial pre-allocation made by fallocate,
>
> So will XFS. You just need to fragment freespace or preallocate
> over a sparse range with some blocks allocated so that the ENOSPC
> occurs on the second+ extent that is allocated in the range given.
>
> > while XFS does not change
> > file size of add any extra blocks to the file at all.
>
> Right, XFS will not change the file size unless the preallocation
> operation completes successfully.
>
> > If the original file size is 0, it changes a bit, Ext4 will still change the
> > file size and leave the partially allocated blocks there, while XFS won't change
> > the file size, but will keep the partially pre-allocated blocks.
>
> I think ext4 shouldn't be changing the file size in this case. Write
> a fstest to trigger this enospc behaviour and have it fail if the
> file size changes on a preallocation that returns ENOSPC....
>
> > I wonder how it should really behave or if it is a filesystem decision?
>
> Should be consistent across all filesystems, hence the fstest...

Yup, the fstest is interesting. Thanks!

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

-- 
Carlos
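
P.S. For anyone following the thread, here is a rough sketch in C of
the check discussed above: record the file size and block count, call
fallocate() with mode 0, and see what changed if it failed. This is not
the fstest itself, only an illustration. The names prealloc_report,
probe_prealloc(), size_changed_on_enospc() and print_report() are made
up for this example, and it makes no attempt to fill the filesystem or
to clean up partial allocations.

```c
#define _GNU_SOURCE             /* exposes fallocate(2) on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type for this sketch only. */
struct prealloc_report {
    int       err;            /* 0 on success, else errno from fallocate() */
    off_t     size_before;    /* st_size before the call */
    off_t     size_after;     /* st_size after the call */
    long long blocks_before;  /* st_blocks (512-byte units) before */
    long long blocks_after;   /* st_blocks (512-byte units) after */
};

/*
 * Try to preallocate [offset, offset + len) with mode 0 and record what
 * changed. Returns 0 if the probe ran (whatever fallocate() returned),
 * or -1 if open() or fstat() failed.
 */
int probe_prealloc(const char *path, off_t offset, off_t len,
                   struct prealloc_report *out)
{
    struct stat st;
    int fd;

    assert(out != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    out->size_before = st.st_size;
    out->blocks_before = (long long)st.st_blocks;

    /* Save errno right away, before any other call can overwrite it. */
    out->err = (fallocate(fd, 0, offset, len) == 0) ? 0 : errno;

    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    out->size_after = st.st_size;
    out->blocks_after = (long long)st.st_blocks;

    close(fd);
    return 0;
}

/* The condition the proposed test would fail on. */
int size_changed_on_enospc(const struct prealloc_report *r)
{
    return r->err == ENOSPC && r->size_after != r->size_before;
}

/* Print a one-line summary of what the call left behind. */
void print_report(const struct prealloc_report *r)
{
    printf("err=%d size %lld -> %lld, blocks %lld -> %lld\n",
           r->err,
           (long long)r->size_before, (long long)r->size_after,
           r->blocks_before, r->blocks_after);
}
```

On a nearly full scratch filesystem, one would call probe_prealloc() on
a test file with a length larger than the free space, then look at
print_report(): blocks_after greater than blocks_before with err set to
ENOSPC means a partial allocation was left behind, and
size_changed_on_enospc() returning 1 is the case the test should flag.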