Re: Check journal is replayable ?


 



Hi.

Thank you for your replies.


"Stephen C. Tweedie" wrote:
> 
> On Mon, 2002-12-02 at 17:10, John Vickers wrote:
> 
> > Is there a simple way, at a shell script level, of finding out whether an ext3 fs
> > has a sane journal, other than mounting it or running a full fsck ?
> 
> Define a "sane journal"?

Well, I think /my/ definition of "Sanity" would be fairly liberal ;-)

Something like having header blocks in a valid format that don't point
outside the fs, and not having hard read errors.

If it had been the filesystem's considered intention to commit suicide
by zeroing its own superblocks and inodes, I don't think it's necessarily
appropriate for society to stand in its way...

> The journal just contains copies of disk blocks.  It's nothing more than
> a list of "here's a copy of the new block number FOO of the
> filesystem."  And the journal is *supposed* to contain gaps after an
> unexpected reboot --- it's by looking for missing bits that we work out
> just how much of the journal did get successfully written out to disk
> when things crashed.

Aha.

> In other words, the journal is really really dumb, and there's next to
> no validation you can sensibly do on its contents without invoking a lot
> of filesystem layout knowledge (and at that point you're into full fsck
> territory.)

Yup.
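
For the narrower question of whether a journal exists and is waiting to be
replayed, the superblock flags can be read without mounting. This is a
minimal sketch, not a validation of the journal contents (which, as noted
above, is not sensibly possible short of fsck). The function names are
hypothetical; the offsets and flag bits are the standard ext2/ext3
superblock layout, and reading a block device is assumed to need root.

```python
import struct

EXT2_MAGIC = 0xEF53
SUPERBLOCK_OFFSET = 1024      # primary superblock starts 1 KiB into the device
COMPAT_HAS_JOURNAL = 0x0004   # bit in s_feature_compat
INCOMPAT_RECOVER = 0x0004     # bit in s_feature_incompat ("needs_recovery")


def journal_flags(sb: bytes) -> dict:
    """Decode the journal-related fields of a raw 1024-byte superblock."""
    (magic,) = struct.unpack_from("<H", sb, 0x38)         # s_magic
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    compat, incompat = struct.unpack_from("<II", sb, 0x5C)
    (journal_inum,) = struct.unpack_from("<I", sb, 0xE0)  # s_journal_inum
    return {
        "has_journal": bool(compat & COMPAT_HAS_JOURNAL),
        "needs_recovery": bool(incompat & INCOMPAT_RECOVER),
        "journal_inum": journal_inum,
    }


def read_superblock(device: str) -> bytes:
    """Read the primary superblock from a device or image file."""
    with open(device, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return f.read(1024)
```

Something like journal_flags(read_superblock("/dev/hdd1")) would then say
whether a replay is pending, and nothing more than that.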

> > AFAICT, the usual way of handling ext3 filesystems seems to be to mark them with fs_passno=0,
> > so they never get fscked from the init scripts - but the journal gets replayed, and a few things
> > get checked at mount time.
 
> No, you should give them a valid pass number to force fsck to run,

Right.

> but
> when fsck sees an ext3 filesystem needing recovery, it skips the full
> check and just does the recovery stage.

Hmm.  I wasn't seeing this.  Maybe my e2fsprogs aren't up to date,
or I was trying to simulate it the wrong way.  This is what I did
(/dev/hdd1 is ext3; I inserted line breaks for readability):

[root@cherry etc]# cat /etc/issue
Mandrake Linux release 9.0 (dolphin) for i586
Kernel 2.4.19-16mdk on an i686 / \l

[root@cherry etc]# tune2fs -i 0 -c 0 /dev/hdd1
tune2fs 1.27ea (14-Mar-2002)
Setting maximal mount count to -1
Setting interval between check 0 seconds

[root@cherry etc]# debugfs -w /dev/hdd1
debugfs 1.27ea (14-Mar-2002)
debugfs:  dirty
debugfs:  q

[root@cherry etc]# time fsck /dev/hdd1
fsck 1.27ea (14-Mar-2002)
e2fsck 1.27ea (14-Mar-2002)
projects1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
projects1: 11/3940832 files (0.0% non-contiguous), 131873/7866802 blocks
Command exited with non-zero status 1
2.00user 3.11system 0:33.58elapsed 15%CPU (0avgtext+0avgdata
0maxresident)k
0inputs+0outputs (290major+1843minor)pagefaults 0swaps

[root@cherry etc]# time fsck /dev/hdd1
fsck 1.27ea (14-Mar-2002)
e2fsck 1.27ea (14-Mar-2002)
projects1: clean, 11/3940832 files, 131873/7866802 blocks
0.00user 0.03system 0:00.08elapsed 35%CPU (0avgtext+0avgdata
0maxresident)k
0inputs+0outputs (273major+64minor)pagefaults 0swaps

[root@cherry etc]#
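
The "non-zero status 1" in the first run is not a failure as such: the
fsck exit status is a bit mask, and 1 means errors were corrected. A
minimal sketch of decoding it, using the values documented in the fsck(8)
and e2fsck(8) manual pages (the function name is hypothetical):

```python
# Exit status bits as documented in fsck(8) / e2fsck(8).
FSCK_EXIT_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "cancelled by user request",
    128: "shared library error",
}


def decode_fsck_status(status: int) -> list:
    """Return the meaning of each bit set in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_EXIT_BITS.items())
            if status & bit]
```

A boot script would typically treat anything with bit 4 or higher set as
needing attention, and 0 or 1 as safe to continue.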

 
> You still want the fsck to run because in case of a filesystem error
> being detected at run time, the kernel can mark the partition as having
> an error, and the subsequent fsck can pick that up and force a full fsck
> to fix it.

Great.  I guess that slightly leaves open the best policy if an
inconsistency /is/ discovered at run-time.

I guess I kinda want the thing to start yelling "Help! Fsck Me Now!" and
waving a flag or something.

But not necessarily to immediately bring down the entire system with a
panic, since there may still be quite enough system left on other
partitions to run e2fsck.

That seems to lead to: "error-behaviour=remount-ro".  Make any sense ?
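
Both the "errors detected" mark and the configured error behaviour live
in the superblock, so a script can inspect them. A minimal sketch,
assuming a raw 1024-byte superblock has already been read from offset
1024 of the device; the function name is hypothetical, the field offsets
and values are the standard ext2/ext3 ones:

```python
import struct

# s_errors values: what the kernel does when it detects an inconsistency.
ERROR_BEHAVIOUR = {1: "continue", 2: "remount-ro", 3: "panic"}


def error_policy(sb: bytes) -> dict:
    """Decode s_state (offset 0x3A) and s_errors (offset 0x3C)."""
    state, errors = struct.unpack_from("<HH", sb, 0x3A)
    return {
        "cleanly_unmounted": bool(state & 0x0001),
        "errors_detected": bool(state & 0x0002),
        "on_error": ERROR_BEHAVIOUR.get(errors, "unknown"),
    }
```

The behaviour itself can be set with "tune2fs -e remount-ro" or the
errors=remount-ro mount option; the sketch only reports what is stored.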



> That mechanism fails if you set the pass number to zero.

Yes.

> You can disable forced fscks while preserving that error-recovery
> behaviour by leaving the passno intact but setting the fsck mount-count
> and check-intervals to zero with tune2fs.
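
To find filesystems that are set up the way I described rather than the
way recommended above, the fstab can be scanned for ext3 entries whose
fs_passno is 0. A minimal sketch with a hypothetical function name; it
assumes the usual six-field fstab layout, where a missing sixth field
counts as 0:

```python
def ext3_entries_skipping_fsck(fstab_text: str) -> list:
    """Return the devices of ext3 fstab entries whose fs_passno is 0."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment
        fields = line.split()
        if len(fields) < 3 or fields[2] != "ext3":
            continue                      # malformed or not ext3
        passno = fields[5] if len(fields) > 5 else "0"
        if passno == "0":
            flagged.append(fields[0])
    return flagged
```

Calling it on the contents of /etc/fstab lists the entries for which the
error-recovery fsck described above would never run.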

Regards,

John.



_______________________________________________

Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
