On Fri, Dec 13, 2024 at 06:22:10AM +0100, Christoph Hellwig wrote: > On Thu, Dec 12, 2024 at 02:02:20PM -0800, Darrick J. Wong wrote: > > On Wed, Dec 11, 2024 at 09:54:40AM +0100, Christoph Hellwig wrote: > > > Zone file systems reuse the basic RT group enabled XFS file system > > > structure to support a mode where each RT group is always written from > > > start to end and then reset for reuse (after moving out any remaining > > > data). There are few minor but important changes, which are indicated > > > by a new incompat flag: > > > > > > 1) there are no bitmap and summary inodes, and thus the sb_rbmblocks > > > superblock field must be cleared to zero > > > > zoned rt requires rt rmap and reflink, and hence metadir. There is no > > such field as sb_rbmblocks anymore. > > I doesn't actually require reflink - in fact it currently is incompatible > with reflink due to GC not understanding refcounts (it does depend on > your reflink code as it's reusing a few bits from that just to make it > confusing). > > And sb_rbmblocks is actually still set for metadir file systems. Oops, I misread that as sb_rbmino. Yes, sb_rbmblocks must be zero now. > > > 4) an overlay of the cowextsizse field for the rtrmap inode so that we > > > > cowextsize > > > > > can persistently track the total amount of bytes currently used in > > > > Isn't this the total number of *fsblocks* currently used? > > or rtblocks? :) But yes, it's not byte granularity obviously, no idea > why I wrote that. > > > Heh, I guess I should go down to my lab and plug in this smr disk and > > see how many zones it reports... > > It will be capacity in bytes / 256MB unless you found a really, really > weird beast. I bet someone will get tempted to make bigger zones for their 120TB hard disk. > > > - fa = xfs_inode_validate_cowextsize(mp, be32_to_cpu(dip->di_cowextsize), > > > - mode, flags, flags2); > > > - if (fa) > > > - return fa; > > > + if (!xfs_has_zoned(mp) || > > > + dip->di_metatype != cpu_to_be16(XFS_METAFILE_RTRMAP)) { > > > + /* COW extent size hint validation */ > > > + fa = xfs_inode_validate_cowextsize(mp, > > > + be32_to_cpu(dip->di_cowextsize), > > > + mode, flags, flags2); > > > > I think there's *some* validation you could do, such as checking that > > i_cowextsize <= the number of blocks in the rtgroup. > > So we do a fair amount of validation in xfs_zone_validate based on the > hardware zone state. I tried to add more here but it failed because > we getting at the rtgroups wasn't easily possible. But yes, I think > a simple rgsize check should be possible at least. > > > I almost wonder if you should add that kind of logic to > > xfs_inode_validate_cowextsize but that might be one incoherence too > > many. OTOH it would probably reduce the number of changes in the fsck > > code. > > I'll take a look, but having a cowextsize helper that validated a field > overlay with an entirely different meaning sounds a bit confusing. Yeah, like I said, it might be one incoherence too far. :) --D