On Wed, Apr 23, 2014 at 04:01:52PM -0400, Brian Foster wrote: > On Wed, Apr 23, 2014 at 04:19:45PM +1000, Dave Chinner wrote: > > On Thu, Apr 10, 2014 at 12:11:04PM -0400, Brian Foster wrote: > > > If one exists, scan the free inode btree in phase 2 of xfs_repair. > > > We use the same general infrastructure as for the inobt scan, but > > > trigger finobt chunk scan logic in in scan_inobt() via the magic > > > value. > > > > > > The new scan_single_finobt_chunk() function is similar to the inobt > > > equivalent with some finobt specific logic. We can expect that > > > underlying inode chunk blocks are already marked used due to the > > > previous inobt scan. We can also expect to find every record > > > tracked by the finobt already accounted for in the in-core tree > > > with equivalent (and internally consistent) inobt record data. > > > > > > Spit out a warning on any divergences from the above and add the > > > inodes referenced by the current finobt record to the appropriate > > > in-core tree. > > > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > .... > > > + > > > + if (!suspect) { > > > + /* > > > + * inodes previously inserted into the uncertain tree should be > > > + * superceded by these when the uncertain tree is processed > > > + */ > > > + nfree = 0; > > > + if (XFS_INOBT_IS_FREE_DISK(rp, 0)) { > > > + nfree++; > > > + ino_rec = set_inode_free_alloc(mp, agno, ino); > > > + } else { > > > + ino_rec = set_inode_used_alloc(mp, agno, ino); > > > + } > > > + for (j = 1; j < XFS_INODES_PER_CHUNK; j++) { > > > + if (XFS_INOBT_IS_FREE_DISK(rp, j)) { > > > + nfree++; > > > + set_inode_free(ino_rec, j); > > > + } else { > > > + set_inode_used(ino_rec, j); > > > + } > > > + } > > > + } else { > > > + /* > > > + * this should handle the case where the inobt scan may have > > > + * already added uncertain inodes > > > + */ > > > + nfree = 0; > > > + for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { > > > + if (XFS_INOBT_IS_FREE_DISK(rp, j)) { > > > + add_aginode_uncertain(mp, agno, ino + j, 1); > > > + nfree++; > > > + } else { > > > + add_aginode_uncertain(mp, agno, ino + j, 0); > > > + } > > > + } > > > + } > > > + > > > +check_freecount: > > > + > > > + if (nfree != be32_to_cpu(rp->ir_freecount)) { > > > + do_warn( > > > +_("finobt ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"), > > > + agno, ino, be32_to_cpu(rp->ir_freecount), nfree); > > > + } > > > + > > > + if (!nfree) { > > > + do_warn( > > > +_("finobt record with no free inodes, inode chunk %d/%u\n"), agno, ino); > > > + } > > > > Shouldn't both of these increment suspect? > > > > So at this point of the scan, we're processing through finobt leaf > blocks. Setting suspect has two side effects that I can see: > > 1.) Suppressing the additional, more granular checks for the > records/inodes that make up the remainder of the block. > > 2.) Setting bad_ino_btree, which causes us to skip phases 6 and 7. > > I think my thought process was that side effect #2 was generally too > heavy handed for a scenario where the freecount doesn't happen to match > or we've failed to remove a fully allocated record from the finobt. Yes, that's true. > Furthermore, we already increment suspect in this function if either the > allocation state of the inode chunk blocks or the allocation state of > any individual inode is unexpected based on the inobt state. The > additional value of the freecount checks is to catch any kind of > intra-record corruption or inter-freecount corruption across the trees, > which IIUC would be fixed naturally provided nothing else is wrong > (e.g., suspect remains 0) when the trees are regenerated. > > In other words, if inode allocation state does differ between the trees, > we already do set suspect. If these warnings fire without suspect being > set, it means we just need to fix up the freecount and there is no > broader tree corruption. Thoughts? Seems reasonable. Perhaps a comment explaining this would be a good idea - we dont have nearly enough comments in repair explaining how it is cross-validating different pieces of metadata... > NOTE: > > When looking through this again, I think there is a hole in the > allocation state logic in that we don't catch a case where the finobt > inode could be allocated and the inobt inode marked as free: > > if (first_rec) { > ... > > nfree = 0; > for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { > if (XFS_INOBT_IS_FREE_DISK(rp, j)) { > nfree++; > if (!suspect && !is_inode_free(first_rec, j)) > suspect++; > } > /* > * XXX: else if the finobt inode is allocated, confirm > * here that the inobt inode is allocated as well! > */ > } > goto check_freecount; > } > > I think something like this might work: > > if (first_rec) { > ... > > nfree = 0; > for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { > int isfree = XFS_INOBT_IS_FREE_DISK(rp, j); > > if (isfree) > nfree++; > > if (!suspect && > isfree != is_inode_free(first_rec, j)) > suspect++; > } > goto check_freecount; > } Yes, that makes sense - we need to validate in both directions. Again, a comment would be a useful reminder here. ;) Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs