
This is a note to let you know that I've just added the patch titled

    xfs: track preallocation separately in xfs_bmapi_reserve_delalloc()

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     xfs-track-preallocation-separately-in-xfs_bmapi_reserve_delalloc.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From hch@xxxxxx Tue Jan 10 11:25:03 2017
From: Christoph Hellwig <hch@xxxxxx>
Date: Mon, 9 Jan 2017 16:38:43 +0100
Subject: xfs: track preallocation separately in xfs_bmapi_reserve_delalloc()
To: stable@xxxxxxxxxxxxxxx
Cc: linux-xfs@xxxxxxxxxxxxxxx, Brian Foster <bfoster@xxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>
Message-ID: <1483976343-661-13-git-send-email-hch@xxxxxx>

From: Brian Foster <bfoster@xxxxxxxxxx>

commit 974ae922efd93b07b6cdf989ae959883f6f05fd8 upstream.

Speculative preallocation is currently processed entirely by the callers
of xfs_bmapi_reserve_delalloc(). The caller determines how much
preallocation to include, adjusts the extent length and passes down the
resulting request.

While this works fine for post-eof speculative preallocation, it is not
as reliable for COW fork preallocation. COW fork preallocation is
implemented via the cowextszhint, which aligns the start offset as well
as the length of the extent. Further, it is difficult for the caller to
accurately identify when preallocation occurs because the returned
extent could have been merged with neighboring extents in the fork.

To simplify this situation and facilitate further COW fork preallocation
enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
preallocation parameter to incorporate into the allocation request. The
preallocation blocks value is tacked onto the end of the request and
adjusted to accommodate neighboring extents and extent size limits.
Since xfs_bmapi_reserve_delalloc() now knows precisely how much
preallocation was included in the allocation, it can also tag the inodes
appropriately to support preallocation reclaim.

Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
use the preallocation mechanism. This patch should not change behavior
outside of correctly tagging reflink inodes when start offset
preallocation occurs (which the caller does not handle correctly).

Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 fs/xfs/libxfs/xfs_bmap.c |   23 +++++++++++++++++++++--
 fs/xfs/libxfs/xfs_bmap.h |    2 +-
 fs/xfs/xfs_iomap.c       |    2 +-
 fs/xfs/xfs_reflink.c     |    2 +-
 4 files changed, 24 insertions(+), 5 deletions(-)

--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -50,6 +50,7 @@
 #include "xfs_ag_resv.h"
 #include "xfs_refcount.h"
 #include "xfs_rmap_btree.h"
+#include "xfs_icache.h"
 
 
 kmem_zone_t		*xfs_bmap_free_item_zone;
@@ -4247,8 +4248,9 @@ int
 xfs_bmapi_reserve_delalloc(
 	struct xfs_inode	*ip,
 	int			whichfork,
-	xfs_fileoff_t		aoff,
+	xfs_fileoff_t		off,
 	xfs_filblks_t		len,
+	xfs_filblks_t		prealloc,
 	struct xfs_bmbt_irec	*got,
 	xfs_extnum_t		*lastx,
 	int			eof)
@@ -4260,10 +4262,17 @@ xfs_bmapi_reserve_delalloc(
 	char			rt = XFS_IS_REALTIME_INODE(ip);
 	xfs_extlen_t		extsz;
 	int			error;
+	xfs_fileoff_t		aoff = off;
 
-	alen = XFS_FILBLKS_MIN(len, MAXEXTLEN);
+	/*
+	 * Cap the alloc length. Keep track of prealloc so we know whether to
+	 * tag the inode before we return.
+	 */
+	alen = XFS_FILBLKS_MIN(len + prealloc, MAXEXTLEN);
 	if (!eof)
 		alen = XFS_FILBLKS_MIN(alen, got->br_startoff - aoff);
+	if (prealloc && alen >= len)
+		prealloc = alen - len;
 
 	/* Figure out the extent size, adjust alen */
 	if (whichfork == XFS_COW_FORK)
@@ -4329,6 +4338,16 @@ xfs_bmapi_reserve_delalloc(
 	 */
 	xfs_bmbt_get_all(xfs_iext_get_ext(ifp, *lastx), got);
 
+	/*
+	 * Tag the inode if blocks were preallocated. Note that COW fork
+	 * preallocation can occur at the start or end of the extent, even when
+	 * prealloc == 0, so we must also check the aligned offset and length.
+	 */
+	if (whichfork == XFS_DATA_FORK && prealloc)
+		xfs_inode_set_eofblocks_tag(ip);
+	if (whichfork == XFS_COW_FORK && (prealloc || aoff < off || alen > len))
+		xfs_inode_set_cowblocks_tag(ip);
+
 	ASSERT(got->br_startoff <= aoff);
 	ASSERT(got->br_startoff + got->br_blockcount >= aoff + alen);
 	ASSERT(isnullstartblock(got->br_startblock));
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -242,7 +242,7 @@ struct xfs_bmbt_rec_host *
 		int fork, int *eofp, xfs_extnum_t *lastxp,
 		struct xfs_bmbt_irec *gotp, struct xfs_bmbt_irec *prevp);
 int	xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
-		xfs_fileoff_t aoff, xfs_filblks_t len,
+		xfs_fileoff_t off, xfs_filblks_t len, xfs_filblks_t prealloc,
 		struct xfs_bmbt_irec *got, xfs_extnum_t *lastx, int eof);
 
 enum xfs_bmap_intent_type {
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -622,7 +622,7 @@ xfs_file_iomap_begin_delay(
 
 retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_DATA_FORK, offset_fsb,
-			end_fsb - offset_fsb, &got, &idx, eof);
+			end_fsb - offset_fsb, 0, &got, &idx, eof);
 	switch (error) {
 	case 0:
 		break;
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -293,7 +293,7 @@ xfs_reflink_reserve_cow(
 
 retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff,
-			end_fsb - imap->br_startoff, &got, &idx, eof);
+			end_fsb - imap->br_startoff, 0, &got, &idx, eof);
 	switch (error) {
 	case 0:
 		break;


Patches currently in stable-queue which might be from hch@xxxxxx are

queue-4.9/xfs-always-succeed-when-deduping-zero-bytes.patch
queue-4.9/xfs-fix-crash-and-data-corruption-due-to-removal-of-busy-cow-extents.patch
queue-4.9/xfs-don-t-allow-di_size-with-high-bit-set.patch
queue-4.9/xfs-new-inode-extent-list-lookup-helpers.patch
queue-4.9/xfs-don-t-call-xfs_sb_quota_from_disk-twice.patch
queue-4.9/xfs-factor-rmap-btree-size-into-the-indlen-calculations.patch
queue-4.9/xfs-check-return-value-of-_trans_reserve_quota_nblks.patch
queue-4.9/xfs-complain-if-we-don-t-get-nextents-bmap-records.patch
queue-4.9/xfs-check-for-bogus-values-in-btree-block-headers.patch
queue-4.9/xfs-use-gpf_nofs-when-allocating-btree-cursors.patch
queue-4.9/xfs-fix-max_retries-_show-and-_store-functions.patch
queue-4.9/xfs-fix-double-cleanup-when-cui-recovery-fails.patch
queue-4.9/xfs-don-t-skip-cow-forks-w-delalloc-blocks-in-cowblocks-scan.patch
queue-4.9/xfs-track-preallocation-separately-in-xfs_bmapi_reserve_delalloc.patch
queue-4.9/xfs-use-the-actual-ag-length-when-reserving-blocks.patch
queue-4.9/xfs-ignore-leaf-attr-ichdr.count-in-verifier-during-log-replay.patch
queue-4.9/xfs-pass-post-eof-speculative-prealloc-blocks-to-bmapi.patch
queue-4.9/xfs-don-t-cap-maximum-dedupe-request-length.patch
queue-4.9/xfs-pass-state-not-whichfork-to-trace_xfs_extlist.patch
queue-4.9/xfs-move-agi-buffer-type-setting-to-xfs_read_agi.patch
queue-4.9/xfs-check-minimum-block-size-for-crc-filesystems.patch
queue-4.9/xfs-handle-cow-fork-in-xfs_bmap_trace_exlist.patch
queue-4.9/pci-msi-check-for-null-affinity-mask-in-pci_irq_get_affinity.patch
queue-4.9/xfs-error-out-if-trying-to-add-attrs-and-anextents-0.patch
queue-4.9/xfs-don-t-bug-on-mixed-direct-and-mapped-i-o.patch
queue-4.9/xfs-use-new-extent-lookup-helpers-xfs_file_iomap_begin_delay.patch
queue-4.9/xfs-fix-unbalanced-inode-reclaim-flush-locking.patch
queue-4.9/genirq-affinity-fix-node-generation-from-cpumask.patch
queue-4.9/xfs-use-new-extent-lookup-helpers-in-__xfs_reflink_reserve_cow.patch
queue-4.9/xfs-don-t-crash-if-reading-a-directory-results-in-an-unexpected-hole.patch
queue-4.9/xfs-remove-prev-argument-to-xfs_bmapi_reserve_delalloc.patch
queue-4.9/xfs-clean-up-cow-fork-reservation-and-tag-inodes-correctly.patch
queue-4.9/xfs-forbid-ag-btrees-with-level-0.patch
queue-4.9/xfs-provide-helper-for-counting-extents-from-if_bytes.patch
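
As a reading aid for the xfs_bmapi_reserve_delalloc() hunks in the patch
above, here is a minimal user-space sketch of the length capping and the
tagging decision. It is not kernel code and is not part of the patch: the
cap_alloc_len(), pick_tag() and MODEL_* names are hypothetical,
MODEL_MAXEXTLEN assumes the 21-bit MAXEXTLEN of this kernel era, and the
extent size hint alignment that can move aoff and grow alen is left out, so
pick_tag() takes the already aligned values from its caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors the 21-bit MAXEXTLEN limit; not taken from this patch. */
#define MODEL_MAXEXTLEN 0x1fffffULL

/* Hypothetical stand-ins for the fork selectors and the two inode tags. */
enum model_fork { MODEL_DATA_FORK, MODEL_COW_FORK };
enum model_tag { MODEL_TAG_NONE, MODEL_TAG_EOFBLOCKS, MODEL_TAG_COWBLOCKS };

struct cap_result {
	uint64_t alen;		/* capped allocation length, in blocks */
	uint64_t prealloc;	/* preallocation blocks left after capping */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Cap len + prealloc the way the patched function does: first against the
 * maximum extent length, then (when not at EOF) against the start of the
 * next extent. If the capped length still covers the request, whatever
 * exceeds len is the preallocation that was actually included; otherwise
 * prealloc is left unchanged, as in the patch.
 */
static struct cap_result
cap_alloc_len(uint64_t off, uint64_t len, uint64_t prealloc,
	      bool eof, uint64_t next_startoff)
{
	struct cap_result r;

	/* When not at EOF, the next extent must start past the offset. */
	assert(eof || next_startoff > off);

	r.alen = min_u64(len + prealloc, MODEL_MAXEXTLEN);
	if (!eof)
		r.alen = min_u64(r.alen, next_startoff - off);

	r.prealloc = prealloc;
	if (prealloc && r.alen >= len)
		r.prealloc = r.alen - len;
	return r;
}

/*
 * Decide which tag applies. aoff and alen are the offset and length after
 * any extent size alignment, which this sketch does not compute itself.
 */
static enum model_tag
pick_tag(enum model_fork fork, uint64_t off, uint64_t len,
	 uint64_t aoff, uint64_t alen, uint64_t prealloc)
{
	if (fork == MODEL_DATA_FORK && prealloc)
		return MODEL_TAG_EOFBLOCKS;
	if (fork == MODEL_COW_FORK && (prealloc || aoff < off || alen > len))
		return MODEL_TAG_COWBLOCKS;
	return MODEL_TAG_NONE;
}
```

For example, under these assumptions a request for 8 blocks at offset 100
with 16 blocks of preallocation and a neighboring extent starting at block
112 is capped to alen = 12, of which 4 blocks count as preallocation. A COW
fork request with prealloc == 0 whose aligned offset was moved down from 100
to 96 would still pick the cowblocks tag, which is the start offset case the
commit message says callers did not handle.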