On Thu, Jun 28, 2018 at 10:55:16AM +1000, Dave Chinner wrote:
> On Sun, Jun 24, 2018 at 12:24:13PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> >
> > Use the rmapbt to find inode chunks, query the chunks to compute
> > hole and free masks, and with that information rebuild the inobt
> > and finobt.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
>
> [....]
>
> > +/*
> > + * For each cluster in this blob of inode, we must calculate the
> > + * properly aligned startino of that cluster, then iterate each
> > + * cluster to fill in used and filled masks appropriately.  We
> > + * then use the (startino, used, filled) information to construct
> > + * the appropriate inode records.
> > + */
> > +STATIC int
> > +xfs_repair_ialloc_process_cluster(
> > +	struct xfs_repair_ialloc	*ri,
> > +	xfs_agblock_t			agbno,
> > +	int				blks_per_cluster,
> > +	xfs_agino_t			rec_agino)
> > +{
> > +	struct xfs_imap			imap;
> > +	struct xfs_repair_ialloc_extent	*rie;
> > +	struct xfs_dinode		*dip;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_scrub_context	*sc = ri->sc;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_ino_t			fsino;
> > +	xfs_inofree_t			usedmask;
> > +	xfs_agino_t			nr_inodes;
> > +	xfs_agino_t			startino;
> > +	xfs_agino_t			clusterino;
> > +	xfs_agino_t			clusteroff;
> > +	xfs_agino_t			agino;
> > +	uint16_t			fillmask;
> > +	bool				inuse;
> > +	int				usedcount;
> > +	int				error;
> > +
> > +	/* The per-AG inum of this inode cluster. */
> > +	agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> > +
> > +	/* The per-AG inum of the inobt record. */
> > +	startino = rec_agino + rounddown(agino - rec_agino,
> > +			XFS_INODES_PER_CHUNK);
> > +
> > +	/* The per-AG inum of the cluster within the inobt record. */
> > +	clusteroff = agino - startino;
> > +
> > +	/* Every inode in this holemask slot is filled. */
> > +	nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> > +	fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
> > +			nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
> > +
> > +	/* Grab the inode cluster buffer. */
> > +	imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
> > +	imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
> > +	imap.im_boffset = 0;
> > +
> > +	error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
> > +			XFS_IGET_UNTRUSTED);
>
> This is going to error out if the cluster we are asking to be mapped
> has no record in the inobt.

It does?  xfs_imap_to_bp is a straightforward wrapper around
xfs_trans_read_buf and xfs_buf_offset; it never consults the inobt.
If the inode buffer verifiers trigger then yes we'll blow out to
userspace, but the inobt can be totally trashed and that won't cause
this to fail.

<confused>

> Aren't we trying to rebuild the inobt here from the rmap's idea of
> on-disk clusters? So how do we rebuild the inobt record if we can't
> already find the chunk record in the inobt?
>
> At minimum, this needs a comment explaining why it works.

/*
 * Having manually mapped part of a reverse-mapping record to an inode
 * cluster map, use the map to read the inode cluster directly off the
 * disk.
 */

> > +/* Initialize new inobt/finobt roots and implant them into the AGI. */
> > +STATIC int
> > +xfs_repair_iallocbt_reset_btrees(
> > +	struct xfs_scrub_context	*sc,
> > +	struct xfs_owner_info		*oinfo,
> > +	int				*log_flags)
> > +{
> > +	struct xfs_agi			*agi;
> > +	struct xfs_buf			*bp;
> > +	struct xfs_mount		*mp = sc->mp;
> > +	xfs_fsblock_t			inofsb;
> > +	xfs_fsblock_t			finofsb;
> > +	enum xfs_ag_resv_type		resv;
> > +	int				error;
> > +
> > +	agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> > +
> > +	/* Initialize new inobt root. */
> > +	resv = XFS_AG_RESV_NONE;
> > +	error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
> > +	if (error)
> > +		return error;
> > +	error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
> > +			&xfs_inobt_buf_ops);
> > +	if (error)
> > +		return error;
> > +	agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
> > +	agi->agi_level = cpu_to_be32(1);
> > +	*log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
> > +
> > +	/* Initialize new finobt root. */
> > +	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
> > +		return 0;
> > +
> > +	resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
>
> Comment explaining this?

m_inotbt_nores (which, ugh, why isn't that xfs_finobt_nores?) indicates
if we succeeded at making per-AG reservations for finobt expansion.  If
not, then don't bother.

/*
 * If we successfully reserved space for finobt expansion, use that
 * reservation for the rebuilt btree.
 */

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
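
For anyone following the startino/clusteroff/fillmask arithmetic in xfs_repair_ialloc_process_cluster(), here is a small userspace sketch of the same calculation. It is not the kernel code: maskn(), locate_cluster() and struct cluster_pos are hypothetical stand-ins, and the constants (64 inodes per inobt record, 16 holemask bits, so 4 inodes per holemask bit) are assumptions about the on-disk format rather than values taken from this patch.

```c
#include <stdint.h>

/* Assumed geometry: 64 inodes per inobt record, 16 holemask bits. */
#define INODES_PER_CHUNK	64
#define INODES_PER_HOLEMASK_BIT	(INODES_PER_CHUNK / 16)

/* Hypothetical stand-in for xfs_inobt_maskn(): n set bits starting at bit i. */
static uint64_t maskn(unsigned int i, unsigned int n)
{
	uint64_t bits = (n >= 64) ? ~0ULL : ((1ULL << n) - 1);

	return bits << i;
}

/* Illustrative result type; not a structure from the patch. */
struct cluster_pos {
	uint32_t	startino;	/* first agino of the inobt record */
	uint32_t	clusteroff;	/* cluster offset inside that record */
	uint16_t	fillmask;	/* holemask slots covered by the cluster */
};

/*
 * rec_agino:  agino at the start of the rmap extent
 * agino:      agino of the first inode in this cluster
 * nr_inodes:  number of inodes in one cluster
 */
static struct cluster_pos locate_cluster(uint32_t rec_agino, uint32_t agino,
					 uint32_t nr_inodes)
{
	struct cluster_pos pos;
	uint32_t delta = agino - rec_agino;

	/* Round down to the record boundary, relative to rec_agino. */
	pos.startino = rec_agino + (delta - delta % INODES_PER_CHUNK);
	pos.clusteroff = agino - pos.startino;
	pos.fillmask = (uint16_t)maskn(pos.clusteroff / INODES_PER_HOLEMASK_BIT,
				       nr_inodes / INODES_PER_HOLEMASK_BIT);
	return pos;
}
```

With rec_agino = 128, agino = 224 and 32 inodes per cluster, this gives startino = 192, clusteroff = 32 and fillmask = 0xFF00, i.e. the cluster occupies the upper eight holemask slots of the record starting at agino 192.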