On Mon, Jan 07, 2019 at 08:45:55AM -0500, Brian Foster wrote: > On Fri, Jan 04, 2019 at 04:29:16PM -0800, Darrick J. Wong wrote: > > On Fri, Jan 04, 2019 at 01:38:56PM -0500, Brian Foster wrote: > > > On Mon, Dec 31, 2018 at 06:09:10PM -0800, Darrick J. Wong wrote: > > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > > > > > Teach scrub how to handle the case that there are one or more inobt > > > > records covering a given inode cluster. This fixes the operation on big > > > > block filesystems (e.g. 64k blocks, 512 byte inodes). > > > > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > > --- > > > > fs/xfs/scrub/ialloc.c | 8 ++++++-- > > > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c > > > > index 2f6c2d7fa3fd..1be6a5ebd61e 100644 > > > > --- a/fs/xfs/scrub/ialloc.c > > > > +++ b/fs/xfs/scrub/ialloc.c > > > > @@ -162,6 +162,7 @@ xchk_iallocbt_check_cluster_ifree( > > > > xfs_ino_t fsino; > > > > xfs_agino_t agino; > > > > unsigned int offset; > > > > + unsigned int cluster_buf_base; > > > > bool irec_free; > > > > bool ino_inuse; > > > > bool freemask_ok; > > > > @@ -174,11 +175,14 @@ xchk_iallocbt_check_cluster_ifree( > > > > * Given an inobt record, an offset of a cluster within the record, > > > > * and an offset of an inode within a cluster, compute which fs inode > > > > * we're talking about and the offset of the inode record within the > > > > - * inode buffer. > > > > + * inode buffer, being careful about inobt records that don't align > > > > + * with the start of the inode buffer when block sizes are large. > > > > */ > > > > agino = irec->ir_startino + cluster_base + cluster_index; > > > > fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino); > > > > - offset = cluster_index * mp->m_sb.sb_inodesize; > > > > + cluster_buf_base = XFS_INO_TO_OFFSET(mp, irec->ir_startino + > > > > + cluster_base); > > > > + offset = (cluster_buf_base + cluster_index) * mp->m_sb.sb_inodesize; > > > > > > So cluster_buf_base should always be 0 unless the record itself is not > > > cluster aligned (i.e., the second record in a large FSB), right? > > > > Right. For the (n > 0)th inobt record for a large FSB, ir_startino's > > block offset will be less than inopblock, so cluster_buf_base will be > > greater than zero. > > > > > If so, cluster_base should also be 0 in any case where > > > cluster_buf_base != 0. I'm wondering if that can be made a little > > > more explicit, or perhaps self-documented with an assert. > > > > Right. For the more typical case where there are multiple clusters for > > each inobt record, ir_startino's block offset will always point to the > > start of an inode cluster (and therefore cluster_buf_base will be zero) > > but cluster_base will increment as we visit each cluster in the inobt. > > > > I think this code can become: > > > > cluster_buf_base = XFS_INO_TO_OFFSET(mp, irec->ir_startino); > > ASSERT(cluster_buf_base == 0 || cluster_base == 0); > > offset = (cluster_buf_base + cluster_index) << mp->m_sb.sb_inopblog; > > > > But I'll go feed that to the test machines before I commit to that. :) > > > > > Thinking more about it, it's kind of confusing to me either way because > > > cluster_index isn't really a cluster index in this case, rather it's > > > more of a record index because it doesn't account for the fact that the > > > record is a subset of the cluster. Hmmm... could we simplify this by > > > setting imap.im_boffset appropriately in the caller such that dip always > > > points to either the first inode in the buffer or first inode in the > > > record, then just pass in dip directly and let the caller increment it > > > in the associated loop? Maybe that's something for another patch.. > > > > I think that would be possible... > > > > If so, I think it's worthwhile because then you'd only need to pass the > record, the inode index and the dip pointer to this function. AFAICT > you'd just need to pull up the cluster_buf_base logic here (along with > a useful comment) to the im_boffset assignment in the caller. <nod> It looks like it cleans things up quite a bit. --D > Brian > > > --D > > > > > Brian > > > > > > > if (offset >= BBTOB(cluster_bp->b_length)) { > > > > xchk_btree_set_corrupt(bs->sc, bs->cur, 0); > > > > goto out; > > > >