Re: [PATCH RFC 06/10] xfs: iomap CoW-based atomic write support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/02/2025 20:05, Darrick J. Wong wrote:
On Tue, Feb 04, 2025 at 12:01:23PM +0000, John Garry wrote:
In cases of an atomic write occurs for misaligned or discontiguous disk
blocks, we will use a CoW-based method to issue the atomic write.

So, for that case, return -EAGAIN to request that the write be issued in
CoW atomic write mode. The dio write path should detect this, similar to
how misaligned regalar DIO writes are handled.

Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
---
  fs/xfs/xfs_iomap.c | 68 ++++++++++++++++++++++++++++++++++++++++++++--
  1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ae3755ed00e6..2c2867d728e4 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -809,9 +809,12 @@ xfs_direct_write_iomap_begin(
  	struct xfs_bmbt_irec	imap, cmap;
  	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
  	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
+	bool			atomic = flags & IOMAP_ATOMIC;
  	int			nimaps = 1, error = 0;
  	bool			shared = false;
+	bool			found = false;
  	u16			iomap_flags = 0;
+	bool			need_alloc;
  	unsigned int		lockmode;
  	u64			seq;
@@ -832,7 +835,7 @@ xfs_direct_write_iomap_begin(
  	 * COW writes may allocate delalloc space or convert unwritten COW
  	 * extents, so we need to make sure to take the lock exclusively here.
  	 */
-	if (xfs_is_cow_inode(ip))
+	if (xfs_is_cow_inode(ip) || atomic)
  		lockmode = XFS_ILOCK_EXCL;
  	else
  		lockmode = XFS_ILOCK_SHARED;
@@ -857,12 +860,73 @@ xfs_direct_write_iomap_begin(
  	if (error)
  		goto out_unlock;
+
+	if (flags & IOMAP_ATOMIC_COW) {
+		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
+				&lockmode,
+				(flags & IOMAP_DIRECT) || IS_DAX(inode), true);

Weird nit not relate to this patch: Is there ever a case where
IS_DAX(inode) and (flags & IOMAP_DAX) disagree?  I wonder if this odd
construction could be simplified to:

	(flags & (IOMAP_DIRECT | IOMAP_DAX))

I'm not sure. I assume that we always want to convert for DAX, and IOMAP_DAX may not be set always for DIO path - but I only see xfs_file_write_iter() -> xfs_file_dax_write() ->dax_iomap_rw(xfs_dax_write_iomap_ops), which sets IOMAP_DAX in iomap_iter.flags


?

+		if (error)
+			goto out_unlock;
+
+		end_fsb = imap.br_startoff + imap.br_blockcount;
+		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
+
+		if (imap.br_startblock != HOLESTARTBLOCK) {
+			seq = xfs_iomap_inode_sequence(ip, 0);
+
+			error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags,
+				iomap_flags | IOMAP_F_ATOMIC_COW, seq);
+			if (error)
+				goto out_unlock;
+		}
+		seq = xfs_iomap_inode_sequence(ip, 0);
+		xfs_iunlock(ip, lockmode);
+		return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
+					iomap_flags | IOMAP_F_ATOMIC_COW, seq);
+	}

/me wonders if this should be a separate helper so that the main
xfs_direct_write_iomap_begin doesn't get even longer... but otherwise
the logic in here looks sane.

I can do that. Maybe some code can be factored out for regular "found cow path".


+
+	need_alloc = imap_needs_alloc(inode, flags, &imap, nimaps);
+
+	if (atomic) {
+		/* Use CoW-based method if any of the following fail */
+		error = -EAGAIN;
+
+		/*
+		 * Lazily use CoW-based method for initial alloc of data.
+		 * Check br_blockcount for FSes which do not support atomic
+		 * writes > 1x block.
+		 */
+		if (need_alloc && imap.br_blockcount > 1)
+			goto out_unlock;
+
+		/* Misaligned start block wrt size */
+		if (!IS_ALIGNED(imap.br_startblock, imap.br_blockcount))
+			goto out_unlock;
+
+		/* Discontiguous or mixed extents */
+		if (!imap_spans_range(&imap, offset_fsb, end_fsb))
+			goto out_unlock;
+	}

(Same two comments here.)

ok


+
  	if (imap_needs_cow(ip, flags, &imap, nimaps)) {
  		error = -EAGAIN;
  		if (flags & IOMAP_NOWAIT)
  			goto out_unlock;
+ if (atomic) {
+			/* Detect whether we're already covered in a cow fork */
+			error  = xfs_find_trim_cow_extent(ip, &imap, &cmap, &shared, &found);
+			if (error)
+				goto out_unlock;
+
+			if (shared) {
+				error = -EAGAIN;
+				goto out_unlock;

What is this checking?  That something else already created a mapping in
the COW fork, so we want to bail out to get rid of it?

I want to check if some data is shared. In that case, we should unshare.

And I am not sure if that check is sufficient.

On the buffered write path, we may have something in a CoW fork - in that case it should be flushed, right?

Thanks,
John




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux