Re: fallocate mode flag for "unshare blocks"?

On 2016-03-31 11:31, Andreas Dilger wrote:
On Mar 31, 2016, at 1:55 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

On Wed, Mar 30, 2016 at 05:32:42PM -0700, Liu Bo wrote:
Well, btrfs fallocate doesn't allocate space if the extent is a shared one
because it thinks the space is already allocated.  So a later overwrite
of this shared extent may hit ENOSPC errors.

And this makes it an incorrect implementation of posix_fallocate,
which glibc implements using fallocate if available.

It isn't really useful for a COW filesystem to implement fallocate()
to reserve blocks.  Even if it did allocate all of the blocks on the
initial fallocate() call, when it comes time to overwrite these blocks
new blocks need to be allocated as the old ones will not be overwritten.

Because of snapshots that could hold references to the old blocks,
there isn't even a guarantee that the previously fallocated blocks will
be released in a reasonable time to free up an equal amount of space.

That really depends on how it's done. AFAIK, unwritten extents on BTRFS are block reservations which make sure that you can write there (IOW, the unwritten extent gets converted to a regular extent in-place, not via COW). This means that it is possible to guarantee that the first write to that area will work, which is technically all that POSIX requires.

This in turn means that software like systemd and RDBMSes doesn't see things working quite the way it expects them to, but that's because it makes assumptions based on existing technology.
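For illustration only, here is a minimal sketch of the pattern being discussed: preallocate a range with posix_fallocate(), then perform the first write into it. The helper name preallocate_then_write and the temporary file name are made up for this example; nothing here is btrfs-specific, and it demonstrates only the calling pattern, not what any particular filesystem guarantees for later overwrites.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a temporary file in `dir`, preallocate
 * `len` bytes, then do the first write into the preallocated range.
 * Returns 0 on success or an errno-style error number on failure.
 */
int preallocate_then_write(const char *dir, off_t len)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/falloc-demo-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof path)
        return ENAMETOOLONG;

    int fd = mkstemp(path);
    if (fd < 0)
        return errno;
    unlink(path); /* the file disappears once fd is closed */

    /* posix_fallocate() returns the error number directly; it does not
     * set errno and return -1 the way fallocate() does. */
    int err = posix_fallocate(fd, 0, len);
    if (err == 0) {
        char buf[4096];
        memset(buf, 0xab, sizeof buf);
        size_t chunk = (size_t)len < sizeof buf ? (size_t)len : sizeof buf;

        /* First write into the preallocated range: the case the
         * reservation is meant to cover. */
        ssize_t written = pwrite(fd, buf, chunk, 0);
        if (written < 0)
            err = errno;
        else if ((size_t)written != chunk)
            err = EIO;
    }

    close(fd);
    return err;
}
```

Calling preallocate_then_write("/tmp", 1 << 20) exercises the "reserve, then write once" case. A second overwrite of the same range is where a COW filesystem would need fresh blocks, and that is the case this sketch deliberately does not cover.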

--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


