Re: [PATCH 6/8] xfs: scrub big block inode btrees correctly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 07, 2019 at 08:45:55AM -0500, Brian Foster wrote:
> On Fri, Jan 04, 2019 at 04:29:16PM -0800, Darrick J. Wong wrote:
> > On Fri, Jan 04, 2019 at 01:38:56PM -0500, Brian Foster wrote:
> > > On Mon, Dec 31, 2018 at 06:09:10PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > 
> > > > Teach scrub how to handle the case that there are one or more inobt
> > > > records covering a given inode cluster.  This fixes the operation on big
> > > > block filesystems (e.g. 64k blocks, 512 byte inodes).
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > ---
> > > >  fs/xfs/scrub/ialloc.c |    8 ++++++--
> > > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c
> > > > index 2f6c2d7fa3fd..1be6a5ebd61e 100644
> > > > --- a/fs/xfs/scrub/ialloc.c
> > > > +++ b/fs/xfs/scrub/ialloc.c
> > > > @@ -162,6 +162,7 @@ xchk_iallocbt_check_cluster_ifree(
> > > >  	xfs_ino_t			fsino;
> > > >  	xfs_agino_t			agino;
> > > >  	unsigned int			offset;
> > > > +	unsigned int			cluster_buf_base;
> > > >  	bool				irec_free;
> > > >  	bool				ino_inuse;
> > > >  	bool				freemask_ok;
> > > > @@ -174,11 +175,14 @@ xchk_iallocbt_check_cluster_ifree(
> > > >  	 * Given an inobt record, an offset of a cluster within the record,
> > > >  	 * and an offset of an inode within a cluster, compute which fs inode
> > > >  	 * we're talking about and the offset of the inode record within the
> > > > -	 * inode buffer.
> > > > +	 * inode buffer, being careful about inobt records that don't align
> > > > +	 * with the start of the inode buffer when block sizes are large.
> > > >  	 */
> > > >  	agino = irec->ir_startino + cluster_base + cluster_index;
> > > >  	fsino = XFS_AGINO_TO_INO(mp, bs->cur->bc_private.a.agno, agino);
> > > > -	offset = cluster_index * mp->m_sb.sb_inodesize;
> > > > +	cluster_buf_base = XFS_INO_TO_OFFSET(mp, irec->ir_startino +
> > > > +					     cluster_base);
> > > > +	offset = (cluster_buf_base + cluster_index) * mp->m_sb.sb_inodesize;
> > > 
> > > So cluster_buf_base should always be 0 unless the record itself is not
> > > cluster aligned (i.e., the second record in a large FSB), right?
> > 
> > Right.  For the (n > 0)th inobt record for a large FSB, ir_startino's
> > block offset will be less than inopblock, so cluster_buf_base will be
> > greater than zero.
> > 
> > > If so, cluster_base should also be 0 in any case where
> > > cluster_buf_base != 0.  I'm wondering if that can be made a little
> > > more explicit, or perhaps self-documented with an assert.
> > 
> > Right.  For the more typical case where there are multiple clusters for
> > each inobt record, ir_startino's block offset will always point to the
> > start of an inode cluster (and therefore cluster_buf_base will be zero)
> > but cluster_base will increment as we visit each cluster in the inobt.
> > 
> > I think this code can become:
> > 
> > 	cluster_buf_base = XFS_INO_TO_OFFSET(mp, irec->ir_startino);
> > 	ASSERT(cluster_buf_base == 0 || cluster_base == 0);
> > 	offset = (cluster_buf_base + cluster_index) << mp->m_sb.sb_inopblog;
> > 
> > But I'll go feed that to the test machines before I commit to that. :)
> > 
> > > Thinking more about it, it's kind of confusing to me either way because
> > > cluster_index isn't really a cluster index in this case, rather it's
> > > more of a record index because it doesn't account for the fact that the
> > > record is a subset of the cluster. Hmmm... could we simplify this by
> > > setting imap.im_boffset appropriately in the caller such that dip always
> > > points to either the first inode in the buffer or first inode in the
> > > record, then just pass in dip directly and let the caller increment it
> > > in the associated loop? Maybe that's something for another patch..
> > 
> > I think that would be possible...
> > 
> 
> If so, I think it's worthwhile because then you'd only need to pass the
> record, the inode index and the dip pointer to this function. AFAICT
> you'd just need to pull up the cluster_buf_base logic here (along with
> a useful comment) to the im_boffset assignment in the caller.

<nod> It looks like it cleans things up quite a bit.

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > >  	if (offset >= BBTOB(cluster_bp->b_length)) {
> > > >  		xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
> > > >  		goto out;
> > > > 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux