On Thu, Feb 04, 2021 at 12:55:12PM -0500, Brian Foster wrote: > On Wed, Feb 03, 2021 at 11:43:40AM -0800, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > Clear the needsrepair flag, since it's used to prevent mounting of an > > inconsistent filesystem. We only do this if we make it to the end of > > repair with a non-zero error code, and all the rebuilt indices and > > corrected metadata are persisted correctly. > > > > Note that we cannot combine clearing needsrepair with clearing the quota > > checked flags because we need to clear the quota flags even if > > reformatting the log fails, whereas we can't clear needsrepair if the > > log reformat fails. > > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > --- > > Looks sane, just some nits... > > > include/xfs_mount.h | 1 + > > libxfs/init.c | 12 ++++++++---- > > repair/agheader.c | 21 +++++++++++++++++++++ > > repair/xfs_repair.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ > > 4 files changed, 79 insertions(+), 4 deletions(-) > > > > > ... > > diff --git a/libxfs/init.c b/libxfs/init.c > > index 9fe13b8d..99b1f72a 100644 > > --- a/libxfs/init.c > > +++ b/libxfs/init.c > > @@ -870,9 +870,10 @@ _("%s: Flushing the %s failed, err=%d!\n"), > > * Flush all dirty buffers to stable storage and report on writes that didn't > > * make it to stable storage. > > */ > > -static int > > +int > > libxfs_flush_mount( > > - struct xfs_mount *mp) > > + struct xfs_mount *mp, > > + bool purge) > > { > > int error = 0; > > int err2; > > @@ -884,7 +885,10 @@ libxfs_flush_mount( > > * cannot be written will cause the LOST_WRITE flag to be set in the > > * buftarg. > > */ > > - libxfs_bcache_purge(); > > + if (purge) > > + libxfs_bcache_purge(); > > + else > > + libxfs_bcache_flush(); > > Instead of the parameter, could we just lift the purge into the call > that requires it and let libxfs_flush_mount() just do flushes? I'm > assuming the bcache would be empty in the umount case so the extra flush > should pretty much be a no-op. <nod> Will do. > > > > > /* Flush all kernel and disk write caches, and report failures. */ > > if (mp->m_ddev_targp) { > > @@ -923,7 +927,7 @@ libxfs_umount( > > > > libxfs_rtmount_destroy(mp); > > > > - error = libxfs_flush_mount(mp); > > + error = libxfs_flush_mount(mp, true); > > > > for (agno = 0; agno < mp->m_maxagi; agno++) { > > pag = radix_tree_delete(&mp->m_perag_tree, agno); > ... > > diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c > > index 9409f0d8..4ca4fe5a 100644 > > --- a/repair/xfs_repair.c > > +++ b/repair/xfs_repair.c > > @@ -712,6 +712,52 @@ check_fs_vs_host_sectsize( > > } > > } > > > > +/* Clear needsrepair after a successful repair run. */ > > +void > > +clear_needsrepair( > > + struct xfs_mount *mp) > > +{ > > + struct xfs_buf *bp; > > + int error; > > + > > + /* > > + * If we're going to clear NEEDSREPAIR, we need to make absolutely sure > > + * that everything is ok with the ondisk filesystem. At this point > > + * we've flushed the filesystem metadata out of the buffer cache and > > + * possibly rewrote the log, but we haven't forced the disks to persist > > + * the writes to stable storage. Do that now, and if anything goes > > + * wrong, leave NEEDSREPAIR in place. Don't purge the buffer cache > > + * here since we're not done yet. > > + */ > > + error = -libxfs_flush_mount(mp, false); > > + if (error) { > > + do_warn( > > + _("Cannot clear needsrepair from primary super due to metadata checkpoint failure, err=%d.\n"), > > + error); > > Not sure what metadata checkpoint failure means.. maybe just say that a > flush failed? Ok. > > + return; > > + } > > + > > + /* Clear needsrepair from the superblock. */ > > + bp = libxfs_getsb(mp); > > + if (!bp) { > > + do_warn( > > + _("Cannot clear needsrepair from primary super, out of memory.\n")); > > + return; > > + } > > + if (bp->b_error) { > > + do_warn( > > + _("Cannot clear needsrepair from primary super, IO err=%d.\n"), > > + bp->b_error); > > + } else { > > Maybe try to condense this a bit to something like the following to > reduce the number of branches and strings to translate and whatnot: > > if (!bp || bp->b_error) { > do_warn( > "Failed to clear needsrepair from primary super, err=%d.\n", > bp ? bp->b_error : -ENOMEM); > goto out; > } > > ... > out: > libxfs_buf_release(bp); Ok. > } > > > + mp->m_sb.sb_features_incompat &= > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR; > > + libxfs_sb_to_disk(bp->b_addr, &mp->m_sb); > > + libxfs_buf_mark_dirty(bp); > > + } > > + libxfs_buf_relse(bp); > > + return; > > No need for the return statement here. Fixed, thanks for the nits. :) --D > Brian > > > +} > > + > > int > > main(int argc, char **argv) > > { > > @@ -1132,6 +1178,9 @@ _("Note - stripe unit (%d) and width (%d) were copied from a backup superblock.\ > > libxfs_bcache_flush(); > > format_log_max_lsn(mp); > > > > + if (xfs_sb_version_needsrepair(&mp->m_sb)) > > + clear_needsrepair(mp); > > + > > /* Report failure if anything failed to get written to our fs. */ > > error = -libxfs_umount(mp); > > if (error) > > >