RE: [External] : Re: [PATCH v3] fsck.xfs: mount/umount xfs fs to replay log before running xfs_repair

> -----Original Message-----
> From: Carlos Maiolino <cem@xxxxxxxxxx>
> Sent: 13 December 2022 03:10 PM
> To: Srikanth C S <srikanth.c.s@xxxxxxxxxx>
> Cc: linux-xfs@xxxxxxxxxxxxxxx; Darrick Wong <darrick.wong@xxxxxxxxxx>;
> Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@xxxxxxxxxx>;
> Junxiao Bi <junxiao.bi@xxxxxxxxxx>; david@xxxxxxxxxxxxx
> Subject: [External] : Re: [PATCH v3] fsck.xfs: mount/umount xfs fs to replay
> log before running xfs_repair
> 
> Hi Srikanth.
> 
> On Wed, Nov 23, 2022 at 12:00:50PM +0530, Srikanth C S wrote:
> > After a recent data center crash, we had to recover root filesystems
> > on several thousands of VMs via a boot time fsck. Since these machines
> > are remotely manageable, support can inject the kernel command line
> > with 'fsck.mode=force fsck.repair=yes' to kick off xfs_repair if the
> > machine won't come up or if they suspect there might be deeper issues
> > with latent errors in the fs metadata, which is what they did to try
> > to get everyone running ASAP while anticipating any future problems.
> > But, fsck.xfs does not address the journal replay in case of a crash.
> >
> > fsck.xfs does xfs_repair -e if fsck.mode=force is set. It is possible
> > that when the machine crashes, the fs is in an inconsistent state with
> > the journal log not yet replayed. This can drop the machine into the
> > rescue shell because xfs_fsck.sh does not know how to clean the log.
> > Since the administrator told us to force repairs, address the
> > deficiency by cleaning the log and rerunning xfs_repair.
> >
> > Run xfs_repair -e when fsck.mode=force and fsck.repair=auto or yes.
> > Replay the log only if fsck.mode=force and fsck.repair=yes. For the
> > other options, -fa and -f, drop to the rescue shell if repair detects
> > any corruption.
> >
> > Signed-off-by: Srikanth C S <srikanth.c.s@xxxxxxxxxx>
> > ---
> >  fsck/xfs_fsck.sh | 31 +++++++++++++++++++++++++++++--
> >  1 file changed, 29 insertions(+), 2 deletions(-)
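
For readers following the thread, the option handling described in the commit message can be sketched as a standalone function. This is only an illustration: classify_opts and the strings it prints are hypothetical names, not part of xfs_fsck.sh, and it assumes (as the getopts change in the quoted patch implies) that -f corresponds to fsck.mode=force and -y to fsck.repair=yes.

```shell
#!/bin/sh
# Hypothetical helper, not part of xfs_fsck.sh: report which path the
# patched script would take for a given set of fsck-style flags.
classify_opts() {
	AUTO=false
	FORCE=false
	REPAIR=false
	OPTIND=1	# reset so the function can be called more than once
	while getopts ":aApyf" c; do
		case $c in
		a|A|p)	AUTO=true ;;	# preen/auto: repair, but no log replay
		y)	REPAIR=true ;;	# explicit "yes": log replay is allowed
		f)	FORCE=true ;;	# force: run xfs_repair -e
		esac
	done
	if $FORCE && $REPAIR; then
		echo "repair-and-replay-log"
	elif $FORCE; then
		echo "repair-only"
	else
		echo "no-forced-repair"
	fi
}
```

In the patch itself the replay additionally requires that the first xfs_repair run exits with status 2; the sketch leaves that check out and only shows how the flags are classified.
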
> 
> Did you by any chance write this patch on top of something else you have in
> your tree?
> 
> It doesn't apply to the tree without tweaking it, and the last changes we have in
> the fsck/xfs_fsck.sh file are from 2018, so I assume you have something
> before this patch in your tree.
> 
Sorry for the inconvenience, will verify this.

> Could you please rebase this patch against xfsprogs for-next and resend it?
> Feel free to keep my RwB as long as you don't change the code semantics.
> 
Let me rebase the patch and resend it. Thanks for the Reviewed-by.

> Cheers.
> 
> >
> > diff --git a/fsck/xfs_fsck.sh b/fsck/xfs_fsck.sh
> > index 6af0f22..62a1e0b 100755
> > --- a/fsck/xfs_fsck.sh
> > +++ b/fsck/xfs_fsck.sh
> > @@ -31,10 +31,12 @@ repair2fsck_code() {
> >
> >  AUTO=false
> >  FORCE=false
> > +REPAIR=false
> >  while getopts ":aApyf" c
> >  do
> >         case $c in
> > -       a|A|p|y)        AUTO=true;;
> > +       a|A|p)          AUTO=true;;
> > +       y)              REPAIR=true;;
> >         f)              FORCE=true;;
> >         esac
> >  done
> > @@ -64,7 +66,32 @@ fi
> >
> >  if $FORCE; then
> >         xfs_repair -e $DEV
> > -       repair2fsck_code $?
> > +       error=$?
> > +       if [ $error -eq 2 ] && [ $REPAIR = true ]; then
> > +               echo "Replaying log for $DEV"
> > +               mkdir -p /tmp/repair_mnt || exit 1
> > +               for x in $(cat /proc/cmdline); do
> > +                       case $x in
> > +                               root=*)
> > +                                       ROOT="${x#root=}"
> > +                               ;;
> > +                               rootflags=*)
> > +                                       ROOTFLAGS="-o ${x#rootflags=}"
> > +                               ;;
> > +                       esac
> > +               done
> > +               test -b "$ROOT" || ROOT=$(blkid -t "$ROOT" -o device)
> > +               if [ $(basename $DEV) = $(basename $ROOT) ]; then
> > +                       mount $DEV /tmp/repair_mnt $ROOTFLAGS || exit 1
> > +               else
> > +                       mount $DEV /tmp/repair_mnt || exit 1
> > +               fi
> > +               umount /tmp/repair_mnt
> > +               xfs_repair -e $DEV
> > +               error=$?
> > +               rm -d /tmp/repair_mnt
> > +       fi
> > +       repair2fsck_code $error
> >         exit $?
> >  fi
> >
> > --
> > 1.8.3.1
> 
> --
> Carlos Maiolino
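
As a side note on the hunk above, the root=/rootflags= extraction can be tried in isolation with a sketch like the following. parse_cmdline is a hypothetical wrapper that takes the command line as a string argument so it can be exercised without a real boot; the patch reads /proc/cmdline directly.

```shell
#!/bin/sh
# Hypothetical helper: pull root= and rootflags= out of a kernel command
# line passed as $1 (the quoted patch iterates over /proc/cmdline instead).
parse_cmdline() {
	ROOT=""
	ROOTFLAGS=""
	# $1 is deliberately unquoted so it splits into individual parameters.
	for x in $1; do
		case $x in
		root=*)		ROOT="${x#root=}" ;;
		rootflags=*)	ROOTFLAGS="-o ${x#rootflags=}" ;;
		esac
	done
}

# Example command line (made up for illustration).
parse_cmdline "BOOT_IMAGE=/vmlinuz root=UUID=abcd-1234 ro rootflags=noatime fsck.mode=force fsck.repair=yes"
echo "ROOT=$ROOT ROOTFLAGS=$ROOTFLAGS"
```

When ROOT is not a block device path (for example a UUID= or LABEL= tag), the patch resolves it with blkid -t before comparing it against $DEV, and only passes $ROOTFLAGS to mount when the two refer to the same device.
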

Thanks,
Srikanth



