RE: [External] : Re: [PATCH v3] fsck.xfs: mount/umount xfs fs to replay log before running xfs_repair

> -----Original Message-----
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> Sent: 29 November 2022 04:34 AM
> To: Srikanth C S <srikanth.c.s@xxxxxxxxxx>
> Cc: Carlos Maiolino <cem@xxxxxxxxxx>; linux-xfs@xxxxxxxxxxxxxxx; Darrick
> Wong <darrick.wong@xxxxxxxxxx>; Rajesh Sivaramasubramaniom
> <rajesh.sivaramasubramaniom@xxxxxxxxxx>; Junxiao Bi
> <junxiao.bi@xxxxxxxxxx>; david@xxxxxxxxxxxxx
> Subject: Re: [External] : Re: [PATCH v3] fsck.xfs: mount/umount xfs fs to
> replay log before running xfs_repair
> 
> On Fri, Nov 25, 2022 at 12:09:39PM +0000, Srikanth C S wrote:
> >
> >
> > > -----Original Message-----
> > > From: Carlos Maiolino <cem@xxxxxxxxxx>
> > > Sent: 23 November 2022 05:53 PM
> > > To: Srikanth C S <srikanth.c.s@xxxxxxxxxx>
> > > Cc: linux-xfs@xxxxxxxxxxxxxxx; Darrick Wong
> > > <darrick.wong@xxxxxxxxxx>; Rajesh Sivaramasubramaniom
> > > <rajesh.sivaramasubramaniom@xxxxxxxxxx>;
> > > Junxiao Bi <junxiao.bi@xxxxxxxxxx>; david@xxxxxxxxxxxxx
> > > Subject: Re: [External] : Re: [PATCH v3] fsck.xfs: mount/umount xfs
> > > fs to replay log before running xfs_repair
> > >
> > > On Wed, Nov 23, 2022 at 11:40:53AM +0000, Srikanth C S wrote:
> > > >    Hi
> > > >
> > > >    I resent the same patch as I did not see any review comments.
> > >
> > > Unless I'm looking at the wrong patch, there were comments on your
> > > previous submission:
> > >
> > > https://urldefense.com/v3/__https://lore.kernel.org/linux-xfs/Y2ie54fcHDx5bcG4@B-P7TQMD6M-0146.local/T/*t__;Iw!!ACWV5N9M2RV99hQ!J2Z-2NThyyDm__z9ivhioF9QoHsaHh4Tk733jtNbVMPGeA2vbmbw3h4ZGxOywQFv_lA1Zs_jsUgr$
> > >
> > > Am I missing something?
> 
> Err.... whose comments, Joseph's or Gao's?
> 
> > All the previous comments addressing this patch were about having
> > journal replay code in userspace. But Darrick's comments indicate
> > that this requires making the log endian safe because of the kernel's
> > inability to recover a log from a platform with a different
> > endianness.
> >
> > So I am still wondering how to proceed with this patch. Any
> > comments would be helpful.
> 
@Carlos Maiolino, any comments or thoughts on this patch?

-Srikanth

> Same here, though the long holiday weekend probably didn't help.
> 
> --D
> 
> > > Also, if you are sending the same patch, you can 'flag' it as a
> > > resend, so it's easier to identify that you are simply resending the
> > > same patch. You can do it by appending/prepending 'RESEND' to the
> > > patch tag:
> > >
> > > [RESEND PATCH] <subject>
> > Thanks for the info. Didn't know this.
> > >
> > > Cheers.
> > >
> > > >
> > > >    -Srikanth
> > > >
> > > >
> > > >    __________________________________________________________________
> > > >
> > > >    From: Carlos Maiolino <cem@xxxxxxxxxx>
> > > >    Sent: Wednesday, November 23, 2022 2:06 PM
> > > >    To: Srikanth C S <srikanth.c.s@xxxxxxxxxx>
> > > >    Cc: linux-xfs@xxxxxxxxxxxxxxx <linux-xfs@xxxxxxxxxxxxxxx>; Darrick Wong
> > > >    <darrick.wong@xxxxxxxxxx>; Rajesh Sivaramasubramaniom
> > > >    <rajesh.sivaramasubramaniom@xxxxxxxxxx>; Junxiao Bi
> > > >    <junxiao.bi@xxxxxxxxxx>; david@xxxxxxxxxxxxx <david@xxxxxxxxxxxxx>
> > > >    Subject: [External] : Re: [PATCH v3] fsck.xfs: mount/umount xfs fs to
> > > >    replay log before running xfs_repair
> > > >
> > > >    Hi.
> > > >    Did you plan to resend V3 again, or is this supposed to be V4?
> > > >    On Wed, Nov 23, 2022 at 12:00:50PM +0530, Srikanth C S wrote:
> > > >    > After a recent data center crash, we had to recover root filesystems
> > > >    > on several thousands of VMs via a boot time fsck. Since these
> > > >    > machines are remotely manageable, support can inject the kernel
> > > >    > command line with 'fsck.mode=force fsck.repair=yes' to kick off
> > > >    > xfs_repair if the machine won't come up or if they suspect there
> > > >    > might be deeper issues with latent errors in the fs metadata, which
> > > >    > is what they did to try to get everyone running ASAP while
> > > >    > anticipating any future problems. But, fsck.xfs does not address the
> > > >    > journal replay in case of a crash.
> > > >    >
> > > >    > fsck.xfs does xfs_repair -e if fsck.mode=force is set. It is
> > > >    > possible that when the machine crashes, the fs is in inconsistent
> > > >    > state with the journal log not yet replayed. This can drop the machine
> > > >    > into the rescue shell because xfs_fsck.sh does not know how to clean the
> > > >    > log. Since the administrator told us to force repairs, address the
> > > >    > deficiency by cleaning the log and rerunning xfs_repair.
> > > >    >
> > > >    > Run xfs_repair -e when fsck.mode=force and fsck.repair=auto or yes.
> > > >    > Replay the log only if fsck.mode=force and fsck.repair=yes. For
> > > >    > the other options, -fa and -f, drop to the rescue shell if repair
> > > >    > detects any corruption.
> > > >    >
> > > >    > Signed-off-by: Srikanth C S <srikanth.c.s@xxxxxxxxxx>
> > > >    > ---
> > > >    >  fsck/xfs_fsck.sh | 31 +++++++++++++++++++++++++++++--
> > > >    >  1 file changed, 29 insertions(+), 2 deletions(-)
> > > >    >
> > > >    > diff --git a/fsck/xfs_fsck.sh b/fsck/xfs_fsck.sh
> > > >    > index 6af0f22..62a1e0b 100755
> > > >    > --- a/fsck/xfs_fsck.sh
> > > >    > +++ b/fsck/xfs_fsck.sh
> > > >    > @@ -31,10 +31,12 @@ repair2fsck_code() {
> > > >    >
> > > >    >  AUTO=false
> > > >    >  FORCE=false
> > > >    > +REPAIR=false
> > > >    >  while getopts ":aApyf" c
> > > >    >  do
> > > >    >         case $c in
> > > >    > -       a|A|p|y)        AUTO=true;;
> > > >    > +       a|A|p)          AUTO=true;;
> > > >    > +       y)              REPAIR=true;;
> > > >    >         f)              FORCE=true;;
> > > >    >         esac
> > > >    >  done
> > > >    > @@ -64,7 +66,32 @@ fi
> > > >    >
> > > >    >  if $FORCE; then
> > > >    >         xfs_repair -e $DEV
> > > >    > -       repair2fsck_code $?
> > > >    > +       error=$?
> > > >    > +       if [ $error -eq 2 ] && [ $REPAIR = true ]; then
> > > >    > +               echo "Replaying log for $DEV"
> > > >    > +               mkdir -p /tmp/repair_mnt || exit 1
> > > >    > +               for x in $(cat /proc/cmdline); do
> > > >    > +                       case $x in
> > > >    > +                               root=*)
> > > >    > +                                       ROOT="${x#root=}"
> > > >    > +                               ;;
> > > >    > +                               rootflags=*)
> > > >    > +                                       ROOTFLAGS="-o ${x#rootflags=}"
> > > >    > +                               ;;
> > > >    > +                       esac
> > > >    > +               done
> > > >    > +               test -b "$ROOT" || ROOT=$(blkid -t "$ROOT" -o device)
> > > >    > +               if [ $(basename $DEV) = $(basename $ROOT) ]; then
> > > >    > +                       mount $DEV /tmp/repair_mnt $ROOTFLAGS || exit 1
> > > >    > +               else
> > > >    > +                       mount $DEV /tmp/repair_mnt || exit 1
> > > >    > +               fi
> > > >    > +               umount /tmp/repair_mnt
> > > >    > +               xfs_repair -e $DEV
> > > >    > +               error=$?
> > > >    > +               rm -d /tmp/repair_mnt
> > > >    > +       fi
> > > >    > +       repair2fsck_code $error
> > > >    >         exit $?
> > > >    >  fi
> > > >    >
> > > >    > --
> > > >    > 1.8.3.1
> > > >    --
> > > >    Carlos Maiolino
> > >
> > > --
> > > Carlos Maiolino
> >
> > Regards,
> > Srikanth
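
For readers following the thread, the two decisions the quoted patch makes inline can be pulled out into small functions and tried without touching a real device. This is only a sketch under stated assumptions: parse_cmdline and should_replay are hypothetical helper names that do not exist in xfs_fsck.sh, and the command line is passed in as a string instead of being read from /proc/cmdline. The patch triggers the mount/umount cycle on xfs_repair status 2, which, as I understand the xfs_repair man page, is the status reported when it cannot proceed because of a dirty log.

```shell
#!/bin/sh
# Illustrative sketch only. parse_cmdline and should_replay are hypothetical
# helper names; they are not part of xfs_fsck.sh or xfsprogs.

# Extract root= and rootflags= from a kernel command line string.
# $1: command line text (the patch reads this from /proc/cmdline).
parse_cmdline() {
	ROOT=""
	ROOTFLAGS=""
	# $1 is deliberately unquoted so the shell splits it into words.
	for x in $1; do
		case $x in
			root=*)      ROOT="${x#root=}" ;;
			rootflags=*) ROOTFLAGS="-o ${x#rootflags=}" ;;
		esac
	done
}

# Succeed only when a log replay should be attempted.
# $1: exit status of "xfs_repair -e", $2: value of REPAIR (true or false).
should_replay() {
	[ "$1" -eq 2 ] && [ "$2" = true ]
}

# Example use with a made-up command line.
parse_cmdline "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro rootflags=noatime"
if should_replay 2 true; then
	echo "would mount $ROOT with: $ROOTFLAGS"
fi
```

Keeping the decision separate from the mount/umount step makes it easy to check the fsck.repair=yes gating by hand before any filesystem is mounted.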



