Hi all,

This is v2 of the speculative preallocation FAQ bits. The initial
proposal was here:

	http://oss.sgi.com/archives/xfs/2014-03/msg00316.html

This version includes some updates based on review from arekm and
dchinner. Most notably, the content has been broken down into a few more
questions. Unless there are further major changes required, I'll plan to
post something along these lines to the wiki when my account is
approved. Thanks for the feedback!

Brian

---

Q: Why do files on XFS use more data blocks than expected?

A: The XFS speculative preallocation algorithm allocates extra blocks
beyond end of file (EOF) to minimise file fragmentation during buffered
write workloads. Workloads that benefit from this behaviour include
slowly growing files, concurrent writers and mixed reader/writer
workloads. It also provides fragmentation resistance in situations where
memory pressure prevents adequate buffering of dirty data to allow
formation of large contiguous regions of data in memory.

This post-EOF block allocation is accounted identically to blocks within
EOF. It is visible in 'st_blocks' counts via stat() system calls, and is
accounted as globally allocated space and against quotas that apply to
the associated file. The space is reported by various userspace
utilities (stat, du, df, ls) and thus provides a common source of
confusion for administrators. Post-EOF blocks are temporary in most
situations and are usually reclaimed via several possible mechanisms in
XFS.

See the FAQ entry on speculative preallocation for details.

Q: What is speculative preallocation?

A: XFS speculatively preallocates post-EOF blocks on file-extending
writes in anticipation of future extending writes. The size of a
preallocation is dynamic and depends on the runtime state of the file
and fs. Generally speaking, preallocation is disabled for very small
files and preallocation sizes grow as files grow larger.

Preallocations are capped to the maximum extent size supported by the
filesystem.
Preallocation size is throttled automatically as the filesystem
approaches low free space conditions or other allocation limits on a
file (such as a quota).

In most cases, speculative preallocation is automatically reclaimed when
a file is closed. Preallocation may also persist beyond the lifecycle of
the file descriptor. Certain application behaviors that are known to
cause fragmentation, such as file server workloads and slowly growing
files, benefit from this, and for them the removal of preallocated
blocks is delayed beyond fd close.

Q: How can I speed up or avoid delayed removal of speculative
preallocation?

A: Remove the inode from the VFS cache or unmount the filesystem to
remove speculative preallocations associated with an inode.

Linux 3.8 (and later) includes a scanner to perform background trimming
of files with lingering post-EOF preallocations. The scanner bypasses
dirty files to avoid interference with ongoing writes. A 5-minute scan
interval is used by default and can be adjusted via the following file
(value in seconds):

	/proc/sys/fs/xfs/speculative_prealloc_lifetime

Q: Is speculative preallocation permanent?

A: Although speculative preallocation can lead to reports of excess
space usage, the preallocated space is not permanent unless explicitly
made so via fallocate or a similar interface. Preallocated space can
also be encoded permanently in situations where file size is extended
beyond a range of post-EOF blocks (i.e., via truncate). Otherwise,
preallocated blocks are reclaimed on file close, inode reclaim, unmount
or in the background once file write activity subsides.

Q: My workload has known characteristics - can I tune speculative
preallocation to an optimal fixed size?

A: The 'allocsize=' mount option configures the XFS block allocation
algorithm to use a fixed allocation size. Speculative preallocation is
not dynamically resized when the allocsize mount option is set and thus
the potential for fragmentation is increased.
XFS historically set allocsize to 64k by default.
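
For anyone who wants to look at the 'st_blocks' accounting from the
first answer on a live file, here is a rough Python sketch. It is not
part of the proposed FAQ text. The helper names are made up for
illustration, and it assumes a Linux system where st_blocks is counted
in 512-byte units. A non-zero result is only a hint: it cannot tell
speculative preallocation apart from the tail of the last block or from
space reserved with fallocate. The second helper just reads the scan
interval file mentioned above, if it exists.

```python
import os

# Path taken from the FAQ text; only present when the XFS module is loaded.
LIFETIME_PATH = "/proc/sys/fs/xfs/speculative_prealloc_lifetime"


def allocated_bytes(st):
    """Bytes allocated to a file (assumes st_blocks is in 512-byte units)."""
    return st.st_blocks * 512


def bytes_beyond_size(path):
    """Return allocated bytes in excess of the file size, never negative.

    Sparse files can have fewer bytes allocated than their size, hence
    the clamp to zero.
    """
    st = os.stat(path)
    return max(0, allocated_bytes(st) - st.st_size)


def prealloc_lifetime_seconds(path=LIFETIME_PATH):
    """Return the background scan interval in seconds, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(f"{name}: {bytes_beyond_size(name)} bytes allocated beyond file size")
    print("scan interval (s):", prealloc_lifetime_seconds())
```

Run it with one or more file names as arguments. Checking the same file
while it is being written, after close, and again after the scan
interval has passed should show whether the extra blocks go away, though
the exact numbers will depend on the kernel version and the state of the
filesystem.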