Re: Check journal is replayable ?

John Vickers <jvickers@dial.pipex.com> · Tue, 03 Dec 2002 11:35:12 +0000

Hello again, and thank you for your reply.

Andreas Dilger wrote:
> 
> On Dec 02, 2002  17:10 +0000, John Vickers wrote:
> > Is there a simple way, at a shell script level, of finding out whether an
> > ext3 fs has a sane journal, other than mounting it or running a full fsck ?
> 
> Yes, "tune2fs -l <dev> | grep 'features:.*needs_recovery'", but reading
> further you do not actually need it.

OK.  Thank you.
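
A minimal sketch of that one-liner wrapped as a shell function.  The function
name and the device in the usage comment are made up for illustration; the
grep pattern is the one quoted above.  As explained further down, a "clean"
answer from this test alone does not prove the filesystem is fine.

```shell
# Succeeds (exit status 0) if the "tune2fs -l" output on stdin lists the
# needs_recovery feature, i.e. the journal has not been replayed yet.
journal_needs_recovery() {
    grep -q 'features:.*needs_recovery'
}

# Usage (hypothetical device name):
#   tune2fs -l /dev/hda1 | journal_needs_recovery && echo "journal needs replay"
```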

> > I may quite well be missing a few things here, but what I think I'd like is
> > some option extra to e2fsck that says "if this is a journalled filesystem,
> > and it was shut down uncleanly, just replay the journal and check for
> > immediately obvious problems, but don't bother scanning the whole filesystem
> > unless there's a '-f' in sight".
> 
> That is how e2fsck already works, no need to change anything.

Aha.  I have e2fsprogs 1.27, and from bitkeeper it kind of looks like
this was only documented as of 1.28:

[john@cherry e2fsck]$ bk annotate -admN e2fsck.8.in | more
[...]
97/04/26 1.1      37  | .I device
97/04/26 1.1      38  | .SH DESCRIPTION
97/04/26 1.1      39  | .B e2fsck
02/08/17 1.19     40  | is used to check a Linux second extended file system (ext2fs).
02/08/17 1.19     41  | .B E2fsck
02/08/17 1.19     42  | also
00/08/14 1.11     43  | supports ext2 filesystems countaining a journal, which are
02/08/17 1.19     44  | also sometimes known as ext3 filesystems, by first applying the journal
02/08/17 1.19     45  | to the filesystem before continuing with normal
02/08/17 1.19     46  | .B e2fsck
02/08/17 1.19     47  | processing.  After the journal has been applied, a filesystem will
02/08/17 1.19     48  | normally be marked as clean.  Hence, for ext3 filesystems,
02/08/17 1.19     49  | .B e2fsck
02/08/17 1.19     50  | will normally run the journal and exit, unless its superblock
02/08/17 1.19     51  | indicates that further checking is required.
01/11/24 1.14     52  | .PP
97/04/26 1.1      53  | .I device
[...]
02/07/25 1.17    149  | .TP
99/10/21 1.9     150  | .B \-f
97/04/26 1.1     151  | Force checking even if the file system seems clean.
97/04/26 1.1     152  | .TP

I still don't find it very clear: "normal" and "normally" raise the question
"What are the /abnormal/ cases?"  The doc could be explicit about
the effect of "-f" on ext3.  Whatever.

> By default
> it will replay the journal, and then check the superblock for errors.  If
> no error is marked in the superblock, it is done in a second or so[*].

> Just doing this with the above script isn't enough, since errors can also
> be stored in the journal header in case of very serious errors, and the
> un-recovered filesystem superblock will _appear_ to be fine, but the
> filesystem really needs a full check.
> 
> [*] There is also a feature of ext2/3 that allows you to specify full
>     filesystem checks after a certain number of mounts/time.  Some
>     people turn this off in the mistaken thought that "it has a journal,
>     I don't need no stinking fsck on my filesystems".  However, a journal
>     is no protection against disk, memory, CPU, or kernel errors, so doing
>     periodic full fscks can help find errors before they cause cascading
>     data corruption on your filesystem, or get detected right in the middle
>     of some important work and make your system unusable.

Indeed.

>     If you don't like
>     the "every 20 mounts" full fsck, change it with "tune2fs -c" to be some
>     longer interval.

...or put a note in your diary to run fsck on the last Friday of the month
or something.
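
To see how far away the next mount-count-triggered check is, something like
the sketch below could be used.  It assumes the "Mount count:" and "Maximum
mount count:" lines that "tune2fs -l" prints; the function name and the
device in the usage comment are invented for this example.

```shell
# Reads "tune2fs -l" output on stdin and prints how many mounts remain
# before a full check is forced, or "disabled" if the maximum mount count
# is zero, negative, or missing from the input.
mounts_until_fsck() {
    awk -F: '
        /^Mount count:/         { count = $2 + 0 }
        /^Maximum mount count:/ { max = $2 + 0 }
        END {
            if (max <= 0) print "disabled"
            else          print max - count
        }'
}

# Usage (hypothetical device name):
#   tune2fs -l /dev/hda1 | mounts_until_fsck
# Raising the interval to 50 mounts would be:
#   tune2fs -c 50 /dev/hda1
```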

If we run a full fsck as often as we did on ext2, we are less likely to lose
data than we were on ext2, but we give up fast recovery, which is a big
selling point of ext3.  What to do with "tune2fs -c" seems to depend on
priorities and applications.

I guess for filesystem validation with a "live" filesystem there's your
LVM snapshot solution.

I don't know if anyone's attempted to quantify the risk of losing data
as a function of how often fsck is run.

> > AFAICT, the usual way of handling ext3 filesystems seems to be to mark them
> > with fs_passno=0, so they never get fscked from the init scripts - but the
> > journal gets replayed, and a few things get checked at mount time.
> 
> That is just plain wrong, since it will skip full checking if there was an
> error detected in the filesystem.

I think I've seen this suggested on some of the forums.
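
For anyone wanting to find out whether their own setup follows that advice,
here is a rough sketch that lists ext3 entries whose sixth fstab field
(fs_passno) is 0 or absent.  The function name is made up, and it assumes
plain whitespace-separated fstab lines.

```shell
# Reads fstab-format lines on stdin and prints the mount point of every
# ext3 entry whose fs_passno (field 6) is 0 or missing, i.e. entries that
# the init scripts will never hand to fsck.
ext3_without_fsck_pass() {
    awk '$1 !~ /^#/ && $3 == "ext3" && $6 + 0 == 0 { print $2 }'
}

# Usage:
#   ext3_without_fsck_pass < /etc/fstab
```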

> > If mount fails - because something horrible really did happen - then the
> > /etc/rc.sysinit doesn't seem to have any way of coping, or dropping to an
> > interactive shell.
> 
> That's why you should have passno != 0 for all ext3 filesystems, so that
> e2fsck has a chance to check the superblock before the filesystem is
> mounted.
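
As a sketch of how a boot script might act on the result: the e2fsck man
page documents the exit status as a bit mask (1 = errors corrected,
2 = system should be rebooted, 4 = errors left uncorrected, 8 and above =
operational or usage problems).  The mapping to actions below is only an
assumption about a sensible policy, not what any particular rc.sysinit
does, and the function and device names are invented.

```shell
# Maps an e2fsck exit status to a boot-time action:
#   continue - nothing wrong, or errors were corrected
#   reboot   - e2fsck asked for a reboot and nothing worse happened
#   shell    - uncorrected errors or an operational failure; needs a human
fsck_action() {
    status=$1
    if [ "$status" -ge 4 ]; then
        echo shell
    elif [ $((status & 2)) -ne 0 ]; then
        echo reboot
    else
        echo continue
    fi
}

# Usage (hypothetical device name):
#   e2fsck -p /dev/hda1; fsck_action $?
```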

Regards,

John.

_______________________________________________

Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users