On 2016-03-31 11:31, Andreas Dilger wrote:
On Mar 31, 2016, at 1:55 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
On Wed, Mar 30, 2016 at 05:32:42PM -0700, Liu Bo wrote:
Well, btrfs fallocate doesn't allocate space if it's a shared one
because it thinks the space is already allocated. So a later overwrite
over this shared extent may hit enospc errors.
And this makes it an incorrect implementation of posix_fallocate,
which glibc implements using fallocate if available.
It isn't really useful for a COW filesystem to implement fallocate()
to reserve blocks. Even if it did allocate all of the blocks on the
initial fallocate() call, when it comes time to overwrite these blocks
new blocks need to be allocated as the old ones will not be overwritten.
Because of snapshots that could hold references to the old blocks,
there isn't even the guarantee that the previous fallocated blocks will
be released in a reasonable time to free up an equal amount of space.
That really depends on how it's done. AFAIK, unwritten extents on BTRFS
are block reservations which make sure that you can write there (IOW,
the unwritten extent gets converted to a regular extent in-place, not
via COW). This means that it is possible to guarantee that the first
write to that area will work, which is technically all that POSIX
requires. This in turn means that stuff like systemd and RDBMS software
don't exactly see things working as they expect them to, but that's
because they make assumptions based on existing technology.
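
For illustration, here is a minimal sketch of the pattern under discussion: reserve a range with posix_fallocate, then do the first write into that range. It assumes Linux and a Python build that exposes os.posix_fallocate; the helper name preallocate_then_write and the 64 KiB default are made up for the example. It only exercises the plain, unshared case, so it says nothing about what happens when the extent is shared with a snapshot or reflink copy.

```python
import os
import tempfile


def preallocate_then_write(directory=None, size=64 * 1024):
    """Reserve `size` bytes, then perform the first write into that range.

    Hypothetical helper for illustration only. Returns a tuple of
    (file size after preallocation, bytes written).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Ask the filesystem to guarantee space for bytes [0, size).
        # Raises OSError (e.g. ENOSPC) if the reservation cannot be made.
        os.posix_fallocate(fd, 0, size)
        reserved = os.fstat(fd).st_size

        # First write into the preallocated range. This is the write
        # that the reservation is meant to protect from ENOSPC.
        written = os.pwrite(fd, b"\0" * size, 0)
        os.fsync(fd)
        return reserved, written
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(preallocate_then_write())
```

A second overwrite of the same range after taking a snapshot is the case argued about above, and it would need a real COW filesystem with enough space pressure to observe.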