On Tue, Aug 20, 2024 at 06:29:59PM +0200, Christoph Hellwig wrote: > COW writes remove the amount overwritten either directly for delalloc > reservations, or in earlier deferred transactions than adding the new > amount back in the bmap map transaction. This means st_blocks on an > inode where all data is overwritten using the COW path can temporarily > show a 0 st_blocks. This can easily be reproduced with the pending > zoned device support where all writes use this path and trips the > check in generic/615, but could also happen on a reflink file without > that. > > Fix this by temporarily add the pending blocks to be mapped to > i_delayed_blks while the item is queued. > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> How hard is it to hit this race?a I guess all you have to do is statx in a loop while doing a bunch of cow writeback? > --- > fs/xfs/libxfs/xfs_bmap.c | 1 + > fs/xfs/xfs_bmap_item.c | 18 ++++++++++++++++++ > 2 files changed, 19 insertions(+) > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > index 7df74c35d9f900..a63be14a9873e8 100644 > --- a/fs/xfs/libxfs/xfs_bmap.c > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -4847,6 +4847,7 @@ xfs_bmapi_remap( > } > > ip->i_nblocks += len; > + ip->i_delayed_blks -= len; This proabably ought to have a comment to reference xfs_bmap_defer_add. > xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); > > if (ifp->if_format == XFS_DINODE_FMT_BTREE) > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c > index e224b49b7cff6d..fc5da2dc7c1c66 100644 > --- a/fs/xfs/xfs_bmap_item.c > +++ b/fs/xfs/xfs_bmap_item.c > @@ -346,6 +346,18 @@ xfs_bmap_defer_add( > trace_xfs_bmap_defer(bi); > > xfs_bmap_update_get_group(tp->t_mountp, bi); > + > + /* > + * Ensure the deferred mapping is pre-recorded in i_delayed_blks. > + * > + * Otherwise stat can report zero blocks for an inode that actually has > + * data when the entire mapping is in the process of being overwritten > + * using the out of place write path. This is undone in after > + * xfs_bmapi_remap has incremented di_nblocks for a successful > + * operation. > + */ > + if (bi->bi_type == XFS_BMAP_MAP) > + bi->bi_owner->i_delayed_blks += bi->bi_bmap.br_blockcount; > xfs_defer_add(tp, &bi->bi_list, &xfs_bmap_update_defer_type); > } > > @@ -367,6 +379,9 @@ xfs_bmap_update_cancel_item( > { > struct xfs_bmap_intent *bi = bi_entry(item); > > + if (bi->bi_type == XFS_BMAP_MAP) > + bi->bi_owner->i_delayed_blks -= bi->bi_bmap.br_blockcount; > + > xfs_bmap_update_put_group(bi); > kmem_cache_free(xfs_bmap_intent_cache, bi); > } > @@ -464,6 +479,9 @@ xfs_bui_recover_work( > bi->bi_owner = *ipp; > xfs_bmap_update_get_group(mp, bi); > > + /* see __xfs_bmap_add for details */ xfs_bmap_defer_add? --D > + if (bi->bi_type == XFS_BMAP_MAP) > + bi->bi_owner->i_delayed_blks += bi->bi_bmap.br_blockcount; > xfs_defer_add_item(dfp, &bi->bi_list); > return bi; > } > -- > 2.43.0 > >