On Fri, Jan 04, 2019 at 12:59:06PM -0800, Darrick J. Wong wrote: > On Fri, Jan 04, 2019 at 01:31:04PM -0500, Brian Foster wrote: > > On Mon, Dec 31, 2018 at 06:08:39PM -0800, Darrick J. Wong wrote: > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > > > In xchk_iallocbt_rec, check the alignment of ir_startino by converting > > > the inode cluster block alignment into units of inodes instead of the > > > other way around (converting ir_startino to blocks). This prevents us > > > from tripping over off-by-one errors in ir_startino which are obscured > > > by the inode -> block conversion. > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > --- > > > fs/xfs/scrub/ialloc.c | 45 +++++++++++++++++++++++++++++++++++++++------ > > > 1 file changed, 39 insertions(+), 6 deletions(-) > > > > > > > > > diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c > > > index fd431682db0b..5082331d6c03 100644 > > > --- a/fs/xfs/scrub/ialloc.c > > > +++ b/fs/xfs/scrub/ialloc.c > > > @@ -265,6 +265,42 @@ xchk_iallocbt_check_freemask( > > > return error; > > > } > > > > > > +/* Make sure this inode btree record is aligned properly. */ > > > +STATIC void > > > +xchk_iallocbt_rec_alignment( > > > + struct xchk_btree *bs, > > > + struct xfs_inobt_rec_incore *irec) > > > +{ > > > + struct xfs_mount *mp = bs->sc->mp; > > > + > > > + /* > > > + * finobt records have different positioning requirements than inobt > > > + * records: each finobt record must have a corresponding inobt record. > > > + * That is checked in the xref function, so for now we only catch the > > > + * obvious case where the record isn't even chunk-aligned. > > > + * > > > + * Note also that if a fs block contains more than a single chunk of > > > + * inodes, we will have finobt records only for those chunks containing > > > + * free inodes. > > > + */ > > > + if (bs->cur->bc_btnum == XFS_BTNUM_FINO) { > > > + if (irec->ir_startino & (XFS_INODES_PER_CHUNK - 1)) > > > + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); > > > + return; > > > + } > > > > Is the above really a finobt only check? Couldn't we run this > > sanity check against all records and skip the following for finobt? > > Uhoh, it occurs to me that in the 4kblock !spinodes case we can have > inobt records (and therefore finobt records) that are aligned to > m_cluster_align_inodes, and that value can be less than 64. So I think > this has to be something along the lines of: > Er, yeah. IIRC spinodes required slightly larger inode chunk alignment than traditionally required in order to ensure we don't create conflicting/overlapping sparse records. A quick look at mkfs shows that the cluster size basically defines the sparse chunk size and the chunk alignment == chunk size. > imask = min(XFS_INODES_PER_CHUNK, mp->m_cluster_align_inodes) - 1; > if (irec->ir_startino & imask) > /* set corrupt... */ > > Hmm, and testing seems to bear this out. New patch forthcoming. > Ok, I take it the min() is required because m_cluster_align_inodes can be multiple records in the large FSB case. If so, I wonder if it would be more simple to use m_cluster_align, but otherwise a one-liner comment couldn't hurt. > > Otherwise seems fine: > > > > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > > I've wondered in recent days if this is even necessary at all -- when > we're asked to check the inobt we check the ir_startino alignment of all > those records, so really the only thing we need is the existing check > that for each finobt record there's also an inobt record with the same > ir_startino. OTOH I guess we shouldn't really assume that the calling > process already checked the inobt or that it didn't change between calls. > I guess it makes sense to verify the applicable records match, but in general I agree that "1. verify the inobt 2. verify the finobt is a subset of the inobt" is probably sufficient (and more elegant logic). Brian > --D > > > > + > > > + /* inobt records must be aligned to cluster and inoalignmnt size. */ > > > + if (irec->ir_startino & (mp->m_cluster_align_inodes - 1)) { > > > + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); > > > + return; > > > + } > > > + > > > + if (irec->ir_startino & (mp->m_inodes_per_cluster - 1)) { > > > + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); > > > + return; > > > + } > > > +} > > > + > > > /* Scrub an inobt/finobt record. */ > > > STATIC int > > > xchk_iallocbt_rec( > > > @@ -277,7 +313,6 @@ xchk_iallocbt_rec( > > > uint64_t holes; > > > xfs_agnumber_t agno = bs->cur->bc_private.a.agno; > > > xfs_agino_t agino; > > > - xfs_agblock_t agbno; > > > xfs_extlen_t len; > > > int holecount; > > > int i; > > > @@ -304,11 +339,9 @@ xchk_iallocbt_rec( > > > goto out; > > > } > > > > > > - /* Make sure this record is aligned to cluster and inoalignmnt size. */ > > > - agbno = XFS_AGINO_TO_AGBNO(mp, irec.ir_startino); > > > - if ((agbno & (mp->m_cluster_align - 1)) || > > > - (agbno & (mp->m_blocks_per_cluster - 1))) > > > - xchk_btree_set_corrupt(bs->sc, bs->cur, 0); > > > + xchk_iallocbt_rec_alignment(bs, &irec); > > > + if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) > > > + goto out; > > > > > > iabt->inodes += irec.ir_count; > > > > > >