On Nov 27, 2011, at 6:43 PM, Dave Chinner wrote:

> fallocate() style (or non-delalloc, write syscall time) allocation
> leads to non-optimal file layouts and slower writeback because the
> location that blocks are allocated in no way matches the writeback
> pattern, hence causing an increase in seeks during writeback of
> large numbers of files.
>
> Further, filesystems that are alignment aware (e.g. XFS) will align
> every fallocate() based allocation, greatly fragmenting free space
> when used on small files and the filesystem is on a RAID array.
> However, in XFS, delayed allocation will actually pack the
> allocation across files tightly on disk, resulting in full stripe
> writes (even for sub-stripe unit/width files) during write back.

Well, the question is whether you're optimizing for writing the files, or reading the files. In some cases, files are write once, read never (well, almost never) --- i.e., the backup case. In other cases, the files are write once, read many --- i.e., when installing software. In that case, optimizing for file reads might mean that you want the files aligned on RAID stripes, even though that will fragment free space. It all depends on what you're optimizing for.

I didn't realize that XFS was not aligning to RAID stripes when doing delayed allocation writes. I'm curious --- does it do this only when there are multiple files outstanding for delayed allocation in an allocation group? If someone does a singleton cp of a large file without using fallocate, will XFS try to align the write?

Also, if we are going to use fallocate() as a way of implicitly signaling to the file system that the file should be optimized for reads, as opposed to writes, maybe we should explicitly document it as such in the fallocate(2) man page, so that application programmers understand that these are the semantics they should expect.
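
For concreteness, here is a minimal sketch of the application-side pattern under discussion: reserve the file's blocks up front, then write the data. The helper name write_preallocated() is hypothetical, and the sketch uses posix_fallocate() as a portable stand-in for the Linux-specific fallocate(2). Whether the resulting extents end up stripe-aligned is up to the file system; nothing in this code requests or guarantees it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_fallocate() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any library): preallocate len bytes
 * for the file at path, then write buf into it.
 * Returns 0 on success, -1 on any failure.
 */
int write_preallocated(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && (buf != NULL || len == 0));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /*
     * Allocation happens here, before any data is written, instead of
     * being deferred to writeback as with delayed allocation.
     * posix_fallocate() returns an error number directly and rejects
     * a zero length, so skip it for empty files.
     */
    if (len > 0 && posix_fallocate(fd, 0, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* Write the payload, looping to handle short writes. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    return close(fd) == 0 ? 0 : -1;
}
```

An application that cares more about write throughput for many small files would simply skip the posix_fallocate() call and let the file system decide placement at writeback time.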
-- Ted