This is a note to let you know that I've just added the patch titled xfs: reject all unaligned direct writes to reflinked files to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: xfs-reject-all-unaligned-direct-writes-to-reflinked-files.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. rom 54a4ef8af4e0dc5c983d17fcb9cf5fd25666d94e Mon Sep 17 00:00:00 2001 From: Christoph Hellwig <hch@xxxxxx> Date: Mon, 6 Feb 2017 13:00:54 -0800 Subject: xfs: reject all unaligned direct writes to reflinked files From: Christoph Hellwig <hch@xxxxxx> commit 54a4ef8af4e0dc5c983d17fcb9cf5fd25666d94e upstream. We currently fall back from direct to buffered writes if we detect a remaining shared extent in the iomap_begin callback. But by the time iomap_begin is called for the potentially unaligned end block we might have already written most of the data to disk, which we'd now write again using buffered I/O. To avoid this reject all writes to reflinked files before starting I/O so that we are guaranteed to only write the data once. The alternative would be to unshare the unaligned start and/or end block before doing the I/O. I think that's doable, and will actually be required to support reflinks on DAX file system. But it will take a little more time and I'd rather get rid of the double write ASAP. Signed-off-by: Christoph Hellwig <hch@xxxxxx> Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> [slight changes in context due to the new direct I/O code in 4.10+] Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- fs/xfs/xfs_aops.c | 45 --------------------------------------------- fs/xfs/xfs_file.c | 9 +++++++++ fs/xfs/xfs_trace.h | 2 +- 3 files changed, 10 insertions(+), 46 deletions(-) --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -1263,44 +1263,6 @@ xfs_map_trim_size( bh_result->b_size = mapping_size; } -/* Bounce unaligned directio writes to the page cache. */ -static int -xfs_bounce_unaligned_dio_write( - struct xfs_inode *ip, - xfs_fileoff_t offset_fsb, - struct xfs_bmbt_irec *imap) -{ - struct xfs_bmbt_irec irec; - xfs_fileoff_t delta; - bool shared; - bool x; - int error; - - irec = *imap; - if (offset_fsb > irec.br_startoff) { - delta = offset_fsb - irec.br_startoff; - irec.br_blockcount -= delta; - irec.br_startblock += delta; - irec.br_startoff = offset_fsb; - } - error = xfs_reflink_trim_around_shared(ip, &irec, &shared, &x); - if (error) - return error; - - /* - * We're here because we're trying to do a directio write to a - * region that isn't aligned to a filesystem block. If any part - * of the extent is shared, fall back to buffered mode to handle - * the RMW. This is done by returning -EREMCHG ("remote addr - * changed"), which is caught further up the call stack. - */ - if (shared) { - trace_xfs_reflink_bounce_dio_write(ip, imap); - return -EREMCHG; - } - return 0; -} - STATIC int __xfs_get_blocks( struct inode *inode, @@ -1438,13 +1400,6 @@ __xfs_get_blocks( if (imap.br_startblock != HOLESTARTBLOCK && imap.br_startblock != DELAYSTARTBLOCK && (create || !ISUNWRITTEN(&imap))) { - if (create && direct && !is_cow) { - error = xfs_bounce_unaligned_dio_write(ip, offset_fsb, - &imap); - if (error) - return error; - } - xfs_map_buffer(inode, bh_result, &imap, offset); if (ISUNWRITTEN(&imap)) set_buffer_unwritten(bh_result); --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -554,6 +554,15 @@ xfs_file_dio_aio_write( if ((iocb->ki_pos & mp->m_blockmask) || ((iocb->ki_pos + count) & mp->m_blockmask)) { unaligned_io = 1; + + /* + * We can't properly handle unaligned direct I/O to reflink + * files yet, as we can't unshare a partial block. + */ + if (xfs_is_reflink_inode(ip)) { + trace_xfs_reflink_bounce_dio_write(ip, iocb->ki_pos, count); + return -EREMCHG; + } iolock = XFS_IOLOCK_EXCL; } else { iolock = XFS_IOLOCK_SHARED; --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3353,7 +3353,7 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_conv DEFINE_RW_EVENT(xfs_reflink_reserve_cow); DEFINE_RW_EVENT(xfs_reflink_allocate_cow_range); -DEFINE_INODE_IREC_EVENT(xfs_reflink_bounce_dio_write); +DEFINE_SIMPLE_IO_EVENT(xfs_reflink_bounce_dio_write); DEFINE_IOMAP_EVENT(xfs_reflink_find_cow_mapping); DEFINE_INODE_IREC_EVENT(xfs_reflink_trim_irec); Patches currently in stable-queue which might be from hch@xxxxxx are queue-4.9/xfs-mark-speculative-prealloc-cow-fork-extents-unwritten.patch queue-4.9/xfs-fix-toctou-race-when-locking-an-inode-to-access-the-data-map.patch queue-4.9/xfs-use-iomap-new-flag-for-newly-allocated-delalloc-blocks.patch queue-4.9/xfs-reject-all-unaligned-direct-writes-to-reflinked-files.patch queue-4.9/xfs-allow-unwritten-extents-in-the-cow-fork.patch queue-4.9/xfs-tune-down-agno-asserts-in-the-bmap-code.patch queue-4.9/xfs-verify-free-block-header-fields.patch queue-4.9/xfs-check-for-obviously-bad-level-values-in-the-bmbt-root.patch queue-4.9/xfs-don-t-fail-xfs_extent_busy-allocation.patch queue-4.9/xfs-sync-eofblocks-scans-under-iolock-are-livelock-prone.patch queue-4.9/xfs-use-per-ag-reservations-for-the-finobt.patch queue-4.9/xfs-pull-up-iolock-from-xfs_free_eofblocks.patch queue-4.9/xfs-fail-_dir_open-when-readahead-fails.patch queue-4.9/xfs-update-ctime-and-mtime-on-clone-destinatation-inodes.patch queue-4.9/xfs-only-update-mount-resv-fields-on-success-in-__xfs_ag_resv_init.patch queue-4.9/xfs-use-xfs_icluster_size_fsb-to-calculate-inode-chunk-alignment.patch queue-4.9/xfs-only-reclaim-unwritten-cow-extents-periodically.patch queue-4.9/xfs-try-any-ag-when-allocating-the-first-btree-block-when-reflinking.patch queue-4.9/xfs-fix-and-streamline-error-handling-in-xfs_end_io.patch queue-4.9/xfs-fix-eofblocks-race-with-file-extending-async-dio-writes.patch