On Mon, Aug 26, 2024 at 11:27:34AM -0700, Darrick J. Wong wrote: > On Mon, Aug 26, 2024 at 12:06:58PM +1000, Dave Chinner wrote: > > On Thu, Aug 22, 2024 at 05:20:07PM -0700, Darrick J. Wong wrote: > > > From: Christoph Hellwig <hch@xxxxxx> > > > > > > Split the check that the rtsummary fits into the log into a separate > > > helper, and use xfs_growfs_rt_alloc_fake_mount to calculate the new RT > > > geometry. > > > > > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > > > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > [djwong: avoid division for the 0-rtx growfs check] > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > --- > > > fs/xfs/xfs_rtalloc.c | 43 +++++++++++++++++++++++++++++-------------- > > > 1 file changed, 29 insertions(+), 14 deletions(-) > > > > > > > > > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c > > > index 61231b1dc4b79..78a3879ad6193 100644 > > > --- a/fs/xfs/xfs_rtalloc.c > > > +++ b/fs/xfs/xfs_rtalloc.c > > > @@ -1023,6 +1023,31 @@ xfs_growfs_rtg( > > > return error; > > > } > > > > > > +static int > > > +xfs_growfs_check_rtgeom( > > > + const struct xfs_mount *mp, > > > + xfs_rfsblock_t rblocks, > > > + xfs_extlen_t rextsize) > > > +{ > > > + struct xfs_mount *nmp; > > > + int error = 0; > > > + > > > + nmp = xfs_growfs_rt_alloc_fake_mount(mp, rblocks, rextsize); > > > + if (!nmp) > > > + return -ENOMEM; > > > + > > > + /* > > > + * New summary size can't be more than half the size of the log. This > > > + * prevents us from getting a log overflow, since we'll log basically > > > + * the whole summary file at once. > > > + */ > > > + if (nmp->m_rsumblocks > (mp->m_sb.sb_logblocks >> 1)) > > > + error = -EINVAL; > > > > FWIW, the new size needs to be smaller than that, because the "half > > the log size" must to include all the log metadata needed to > > encapsulate that object. The grwofs transaction also logs inodes and > > the superblock, so that also takes away from the maximum size of > > the summary file.... > > <shrug> It's the same logic as what's there now, and there haven't been > any bug reports, have there? No, none that I know of - it was just an observation that the code doesn't actually guarantee what the comment says it should do. > Though I suppose that's just a reduction > of what? One block for the rtbitmap, and (conservatively) two inodes > and a superblock? The rtbitmap update might touch a lot more than one block. The newly allocated space in the rtbitmap inode is initialised to zeros, and so the xfs_rtfree_range() call from the growfs code to mark the new space free has to write all 1s to that range of the rtbitmap. This is all done in a single transaction, so we might actually be logging a *lot* of rtbitmap buffers here. IIRC, there is a bit per rtextent, so in a 4kB buffer we can mark 32768 rtextents free. If they are 4kB each, then that's 128MB of space tracked per rtbitmap block. This adds up to roughly 3.5MB of log space for the rtbitmap updates per TB of grown rtdev space.... So, yeah, I think that calculation and comment is inaccurate, but we don't have to fix this right now. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx