On Mon, Nov 28, 2011 at 08:55:02AM +0000, Pádraig Brady wrote:
> On 11/28/2011 05:10 AM, Dave Chinner wrote:
> > Quite frankly, if system utilities like cp and tar start to abuse
> > fallocate() by default so they can get "upfront ENOSPC detection",
> > then I will seriously consider making XFS use delayed allocation for
> > fallocate rather than unwritten extents so we don't lose the past 15
> > years worth of IO and aging optimisations that delayed allocation
> > provides us with....
>
> For the record I was considering fallocate() for these reasons.
>
> 1. Improved file layout for subsequent access
> 2. Immediate indication of ENOSPC
> 3. Efficient writing of NUL portions
>
> You lucidly detailed issues with 1. which I suppose could be somewhat
> mitigated by not fallocating < say 1MB, though I suppose file systems
> could be smarter here and not preallocate small chunks (or when
> otherwise not appropriate).

When you consider that some high end filesystem deployments have
alignment characteristics over 50MB (e.g. so each uncompressed 4k
resolution video frame is located on a different set of
non-overlapping disks), arbitrary "don't fallocate below this amount"
heuristics will always have unforeseen failure cases...

In short: leave optimising general allocation strategies to the
filesystems and their developers - there is no One True Solution for
optimal file layout in a given filesystem, let alone across different
filesystems. In fact, I don't even want to think about the mess
fallocate() on everything would make of btrfs because of its COW
structure - it seems to me to guarantee worse fragmentation than
using delayed allocation...

> We can already get ENOSPC from a write()
> after an fallocate() in certain edge cases, so it would probably make
> sense to expand those cases.

fallocate is for preallocation, not for ENOSPC detection.
If you want efficient and effective ENOSPC detection before writing
anything, then you really want a space -reservation- extension to
fallocate. Filesystems that use delayed allocation already have a
space reservation subsystem - it is how they account for space that
is reserved by delayed allocation prior to the real allocation being
done. IMO, allowing userspace some level of access to those
reservations would be more appropriate for early detection of ENOSPC
than using preallocation for everything...

As to efficient writing of NUL ranges - that's what sparse files are
for - you do not need to write or even preallocate NUL ranges when
copying files. Indeed, the most efficient way of dealing with NUL
ranges is to punch a hole and let the filesystem deal with it.....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
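
[Editorial sketch, not part of the original message.] The sparse-file point above can be illustrated in C roughly as follows. This assumes Linux with a kernel and filesystem that support lseek() SEEK_DATA/SEEK_HOLE and fallocate() FALLOC_FL_PUNCH_HOLE; the helper names punch_hole() and copy_skipping_holes() are hypothetical and are not taken from cp, tar or any other utility discussed in the thread. Treat it as one possible approach rather than a recommended implementation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: deallocate [offset, offset + len) in fd so the
 * range reads back as zeros without occupying blocks. PUNCH_HOLE must
 * be combined with KEEP_SIZE, so the file size does not change.
 * Returns 0 on success, -1 with errno set (e.g. EOPNOTSUPP when the
 * filesystem cannot punch holes).
 */
int punch_hole(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}

/*
 * Hypothetical helper: copy src to dst, transferring only the data
 * extents reported by SEEK_DATA/SEEK_HOLE. Holes are never written or
 * preallocated; dst is extended to the source size with ftruncate().
 * Returns 0 on success, -1 on error.
 */
int copy_skipping_holes(int src, int dst)
{
    char buf[65536];
    off_t end = lseek(src, 0, SEEK_END);
    off_t pos = 0;

    if (end < 0)
        return -1;

    while (pos < end) {
        off_t data = lseek(src, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break;              /* only a hole remains up to EOF */
            return -1;
        }
        off_t hole = lseek(src, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        for (pos = data; pos < hole; ) {
            off_t left = hole - pos;
            size_t want = left < (off_t)sizeof buf ? (size_t)left
                                                   : sizeof buf;
            ssize_t n = pread(src, buf, want, pos);
            if (n < 0)
                return -1;
            if (n == 0) {           /* source shrank underneath us */
                errno = EIO;
                return -1;
            }
            for (ssize_t done = 0; done < n; ) {
                ssize_t w = pwrite(dst, buf + done, (size_t)(n - done),
                                   pos + done);
                if (w < 0)
                    return -1;
                done += w;          /* handle short writes */
            }
            pos += n;
        }
    }
    /* Set the final size; a trailing hole stays unallocated. */
    return ftruncate(dst, end);
}
```

A caller would open both files read-write, call copy_skipping_holes(src_fd, dst_fd), and use punch_hole() only when an already-written range of an existing file is known to be all zeros. On filesystems without hole reporting, SEEK_DATA/SEEK_HOLE typically describe the whole file as one data extent, so the copy degrades to a plain full copy.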