On Fri, Jun 14, 2024 at 06:41:55AM +0200, Christoph Hellwig wrote: > On Thu, Jun 13, 2024 at 09:13:10PM -0700, Darrick J. Wong wrote: > > > Looking at the caller below I don't think we need the use_cowextszhint > > > flag here, we can just locally check for prealloc beeing non-0 in > > > the branch below: > > > > That won't work, because xfs_buffered_write_iomap_begin only sets > > @prealloc to nonzero if it thinks is an extending write. For the cow > > fork, we create delalloc reservations that are aligned to the cowextsize > > value for overwrites below eof. > > Yeah. For that we'd need to move the retry loop into > xfs_bmapi_reserve_delalloc - which honestly feels like the more logical > place for it anyway. As in the untested version below, also note > my XXX comment about a comment being added: > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > index c101cf266bc4db..58ff21cb84e0f5 100644 > --- a/fs/xfs/libxfs/xfs_bmap.c > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -4059,19 +4059,33 @@ xfs_bmapi_reserve_delalloc( > uint64_t fdblocks; > int error; > xfs_fileoff_t aoff = off; > + bool use_cowextszhint = > + whichfork == XFS_COW_FORK && !prealloc; > > /* > * Cap the alloc length. Keep track of prealloc so we know whether to > * tag the inode before we return. > */ > +retry: > alen = XFS_FILBLKS_MIN(len + prealloc, XFS_MAX_BMBT_EXTLEN); > if (!eof) > alen = XFS_FILBLKS_MIN(alen, got->br_startoff - aoff); > if (prealloc && alen >= len) > prealloc = alen - len; > > - /* Figure out the extent size, adjust alen */ > - if (whichfork == XFS_COW_FORK) { > + /* > + * If we're targeting the COW fork but aren't creating a speculative > + * posteof preallocation, try to expand the reservation to align with > + * the cow extent size hint if there's sufficient free space. > + * > + * Unlike the data fork, the CoW cancellation functions will free all > + * the reservations at inactivation, so we don't require that every > + * delalloc reservation have a dirty pagecache. > + * > + * XXX(hch): I can't see where we actually require dirty pagecache > + * for speculative data fork preallocations. What am I missing? IIRC a delalloc reservation in the data fork that isn't backing a dirty page will just sit there in the data fork and never get reclaimed. There's no writeback to turn it into an unwritten -> written extent. The blockgc functions won't (can't?) walk the pagecache to find clean regions that could be torn down. xfs destroy_inode just asserts on any reservations that it finds. --D > + */ > + if (use_cowextszhint) { > struct xfs_bmbt_irec prev; > xfs_extlen_t extsz = xfs_get_cowextsz_hint(ip); > > @@ -4090,7 +4104,7 @@ xfs_bmapi_reserve_delalloc( > */ > error = xfs_quota_reserve_blkres(ip, alen); > if (error) > - return error; > + goto out; > > /* > * Split changing sb for alen and indlen since they could be coming > @@ -4140,6 +4154,16 @@ xfs_bmapi_reserve_delalloc( > out_unreserve_quota: > if (XFS_IS_QUOTA_ON(mp)) > xfs_quota_unreserve_blkres(ip, alen); > +out: > + if (error == -ENOSPC || error == -EDQUOT) { > + trace_xfs_delalloc_enospc(ip, off, len); > + if (prealloc || use_cowextszhint) { > + /* retry without any preallocation */ > + prealloc = 0; > + use_cowextszhint = false; > + goto retry; > + } > + } > return error; > } > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index 3783426739258c..34cce017fe7ce1 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -1148,27 +1148,13 @@ xfs_buffered_write_iomap_begin( > } > } > > -retry: > error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb, > end_fsb - offset_fsb, prealloc_blocks, > allocfork == XFS_DATA_FORK ? &imap : &cmap, > allocfork == XFS_DATA_FORK ? &icur : &ccur, > allocfork == XFS_DATA_FORK ? eof : cow_eof); > - switch (error) { > - case 0: > - break; > - case -ENOSPC: > - case -EDQUOT: > - /* retry without any preallocation */ > - trace_xfs_delalloc_enospc(ip, offset, count); > - if (prealloc_blocks) { > - prealloc_blocks = 0; > - goto retry; > - } > - fallthrough; > - default: > + if (error) > goto out_unlock; > - } > > if (allocfork == XFS_COW_FORK) { > trace_xfs_iomap_alloc(ip, offset, count, allocfork, &cmap);