Re: [PATCH v2 2/7] iomap: Add zero unwritten mappings dio support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/12/2024 23:47, Darrick J. Wong wrote:
On Tue, Dec 10, 2024 at 12:57:32PM +0000, John Garry wrote:
For atomic writes support, it is required to only ever submit a single bio
(for an atomic write).

Furthermore, currently the atomic write unit min and max limit is fixed at
the FS block size.

For lifting the atomic write unit max limit, it may occur that an atomic
write spans mixed unwritten and mapped extents. For this case, due to the
iterative nature of iomap, multiple bios would be produced, which is
intolerable.

Add a function to zero unwritten extents in a certain range, which may be
used to ensure that unwritten extents are zeroed prior to issuing of an
atomic write.

I still dislike this.  IMO block untorn writes _is_ a niche feature for
programs that perform IO in large blocks.  Any program that wants a
general "apply all these updates or none of them" interface should use
XFS_IOC_EXCHANGE_RANGE since it has no awu_max restrictions, can handle
discontiguous update ranges, doesn't require block alignment, etc.

That is not a problem which I am trying to solve. Indeed atomic writes are for fixed blocks of data and not atomic file updates.

However, I still think that we should be able to atomic write mixed extents, even though it is a pain to implement. To that end, I could be convinced again that we don't require it...


Instead here we are adding a bunch of complexity, and not even all that
well:

Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
---
  fs/iomap/direct-io.c  | 76 +++++++++++++++++++++++++++++++++++++++++++
  include/linux/iomap.h |  3 ++
  2 files changed, 79 insertions(+)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 23fdad16e6a8..18c888f0c11f 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -805,6 +805,82 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
  }
  EXPORT_SYMBOL_GPL(iomap_dio_rw);
+static loff_t
+iomap_dio_zero_unwritten_iter(struct iomap_iter *iter, struct iomap_dio *dio)
+{
+	const struct iomap *iomap = &iter->iomap;
+	loff_t length = iomap_length(iter);
+	loff_t pos = iter->pos;
+
+	if (iomap->type == IOMAP_UNWRITTEN) {
+		int ret;
+
+		dio->flags |= IOMAP_DIO_UNWRITTEN;
+		ret = iomap_dio_zero(iter, dio, pos, length);

Shouldn't this be detecting the particular case that the mapping for the
kiocb is in mixed state and only zeroing in that case?  This just
targets every unwritten extent, even if the unwritten extent covered the
entire range that is being written.

Right, so I did touch on this in the final comment in patch 4/7 commit log.

Why I did it this way? I did not think that it made much difference, since this zeroing would be generally a one-off and did not merit even more complexity to implement.

It doesn't handle COW, it doesn't

Do we want to atomic write COW?

handle holes, etc.

I did test hole, and it seemed to work. However I noticed that for a hole region we get IOMAP_UNWRITTEN type, like:

# xfs_bmap -vvp /root/mnt/file
/root/mnt/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..127]:        192..319          0 (192..319)         128 000000
   1: [128..255]:      hole                                   128
   2: [256..383]:      448..575          0 (448..575)         128 000000
 FLAG Values:
    0100000 Shared extent
    0010000 Unwritten preallocated extent
    0001000 Doesn't begin on stripe unit
    0000100 Doesn't end   on stripe unit
    0000010 Doesn't begin on stripe width
    0000001 Doesn't end   on stripe width
#
#

For an atomic wrote of length 65536 @ offset 65536.

Any hint on how to create a iomap->type == IOMAP_HOLE?


Also, can you make a version of blkdev_issue_zeroout that returns the
bio so the caller can issue them asynchronously instead of opencoding
the bio_alloc loop in iomap_dev_zero?

ok, fine


+		if (ret)
+			return ret;
+	}
+
+	dio->size += length;
+
+	return length;
+}
+
+ssize_t
+iomap_dio_zero_unwritten(struct kiocb *iocb, struct iov_iter *iter,
+		const struct iomap_ops *ops, const struct iomap_dio_ops *dops)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct iomap_dio *dio;
+	ssize_t ret;
+	struct iomap_iter iomi = {
+		.inode		= inode,
+		.pos		= iocb->ki_pos,
+		.len		= iov_iter_count(iter),
+		.flags		= IOMAP_WRITE,

IOMAP_WRITE | IOMAP_DIRECT, no?

yes, I'll fix that.

And I should also set IOMAP_DIO_WRITE for dio->flags.

Thanks,
John




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux