Re: fallocate mode flag for "unshare blocks"?

On 2016-03-31 03:58, Christoph Hellwig wrote:
On Wed, Mar 30, 2016 at 02:58:38PM -0400, Austin S. Hemmelgarn wrote:
Nothing that I can find in the man-pages or API documentation for Linux's
fallocate explicitly says that it will be fast.  There are bits that say it
should be efficient, but that is not itself well defined (given context, I
would assume it to mean that it doesn't use as much I/O as writing out that
many bytes of zero data, not necessarily that it will return quickly).

And that's pretty much as narrow a definition as we get.  But apparently
gfs2 already breaks that expectation :(
GFS2 breaks other expectations as well (mostly stuff with locking) in arguably more significant ways, so I would not personally consider it to be precedent for breaking this on other filesystems.

delalloc system is careful enough to check that there are enough free
blocks to handle both the allocation and the metadata updates.  The
only gap in this scheme that I can see is if we fallocate, crash, and
upon restart the program then tries to write without retrying the
fallocate.  Can we trade some performance for the added requirement
that we must fallocate -> write -> fsync, and retry the trio if we
crash before the fsync returns?  I think that's already an implicit
requirement, so we might be ok here.
Most of the software I've seen that doesn't use fallocate like this is
either doing odd things otherwise, or is just making sure it has space for
temporary files, so I think it is probably safe to require this.
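For illustration, the fallocate -> write -> fsync trio discussed above could be sketched roughly as below. This is only a sketch under stated assumptions: the helper name reserve_write_sync and its negative-errno return convention are made up for this example and do not come from the thread or from any library. The fallocate(), pwrite() and fsync() calls are the ordinary Linux/glibc ones.

```c
#define _GNU_SOURCE          /* needed for the fallocate() declaration in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): reserve space, write the
 * data, then flush it. Returns 0 on success or a negative errno value. */
int reserve_write_sync(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL && len > 0);   /* fallocate() rejects len == 0 */

    /* Step 1: reserve blocks for the range (mode 0 = plain allocation). */
    if (fallocate(fd, 0, off, (off_t)len) != 0)
        return -errno;

    /* Step 2: write the data; pwrite() may be short, so loop. */
    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -errno;
        }
        if (n == 0)
            return -EIO;              /* no progress; give up */
        done += (size_t)n;
    }

    /* Step 3: only once fsync() returns is the trio considered complete. */
    if (fsync(fd) != 0)
        return -errno;
    return 0;
}
```

Under the requirement proposed above, a program that crashes before fsync() returns would call the whole helper again on restart, rather than assuming the earlier fallocate still holds and going straight to the write.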

posix_fallocate guarantees that you don't get ENOSPC from the write,
and there is plenty of software that relies on that, or that crashes or
causes data integrity problems otherwise.

posix_fallocate is not the same thing as the fallocate syscall. It's there for compatibility, it has less functionality, and most importantly, it _can_ be slow (because at least glibc will emulate it if the underlying FS doesn't support fallocate, which means it's no faster than just using dd).
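
To make the difference concrete, here is a rough sketch of a caller that tries the fallocate syscall first and only opts into the possibly slow posix_fallocate path explicitly. The function name reserve_space and the allow_slow parameter are hypothetical, invented for this example. Note the differing error conventions: fallocate() returns -1 and sets errno, while posix_fallocate() returns the error number directly.

```c
#define _GNU_SOURCE          /* needed for the fallocate() declaration in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): returns 0 on success or a
 * positive errno value. allow_slow is an illustrative knob that lets
 * the caller decide whether an emulated allocation is acceptable. */
int reserve_space(int fd, off_t off, off_t len, int allow_slow)
{
    assert(len > 0);                  /* both calls reject len == 0 */

    /* Native path: the filesystem reserves the blocks itself. */
    if (fallocate(fd, 0, off, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return errno;                 /* real failure, e.g. ENOSPC */

    /* The filesystem has no native support for this operation. */
    if (!allow_slow)
        return EOPNOTSUPP;

    /* Compatibility path: glibc may emulate the allocation by writing
     * into the range, which can take as long as writing the data. */
    return posix_fallocate(fd, off, len);
}
```

A caller that cares about latency would pass allow_slow = 0 and handle EOPNOTSUPP itself; one that only cares about the space guarantee would pass 1.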