On Wed, Sep 10, 2003 at 10:27:15PM +0200, Norbert Preining wrote:
>
> I think the most important point is that I had to clear the entire inode
> 8 to get as far as fsck would *not* go into an endless (at least very
> long) loop restarting itself. Thus, the journal was destroyed, on next
> mount the fs was mounted as ext2, then I could do fsck which gave really
> loads of error message I know (duplicate blocks, etc, all I have ever
> seen ;-), but at the end it suceeded.
>

What version of e2fsck are you using? The newer versions of e2fsck
have more checks that should offer to clear the journal inode (so you
don't have to do this manually via debugfs). Of course, there might
still be some bugs, but do please try the latest e2fsck first. (The
latest e2fsck has a lot of bug fixes and improvements, so I strongly
recommend upgrading; see the Release Notes for more information.)

If someone finds a filesystem where e2fsck doesn't offer to clear the
journal, I would be very interested in getting a compressed raw
e2image dump file of the filesystem so I can reproduce it and create a
test case.

Similarly, if you can find a test case where (IN THE ABSENCE OF
HARDWARE ERRORS) a single run of e2fsck is not capable of fixing all
of the filesystem's problems, I also want to know about it. (That is,
if you run e2fsck -f from the command line, and it fixes some errors,
and then you run e2fsck with the -f option a second time, it should
not find any further errors. If it does, by definition there is a bug
in e2fsck, and I want to know about it.) If you find such a case, at
minimum I would appreciate getting a full transcript of the e2fsck
output, and preferably, a compressed raw e2image dump taken before the
first e2fsck run.

The "in the absence of hardware errors" caveat is why having a
compressed raw e2image dump file is so important. This way we can
uncompress the filesystem metadata on another hard disk, and try to
replicate the problem.
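
As a rough illustration of the reporting procedure described above,
here is a minimal Python sketch: dump the metadata first, then run two
forced checks and keep the transcript. The function names, the file
paths, and the use of -y (so the run does not wait for answers at the
keyboard) are assumptions made for this example and are not part of the
original message. The exit-status bits are the ones listed in the
e2fsck(8) man page, and the e2image pipeline follows the example in the
e2image(8) man page.

```python
import subprocess

# Exit-status bits documented in the e2fsck(8) man page.
E2FSCK_STATUS_BITS = {
    1: "filesystem errors corrected",
    2: "filesystem errors corrected, system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def decode_status(status):
    """Return the meanings of the bits set in an e2fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_STATUS_BITS.items() if status & bit]


def second_run_found_errors(second_status):
    """True if a second forced pass still reported filesystem errors.

    Bits 1, 2 and 4 all mean that filesystem errors were found; the other
    bits (operational error, usage error, ...) are not evidence either way.
    """
    return bool(second_status & (1 | 2 | 4))


def dump_metadata(device, image_path):
    """Save a bzip2-compressed raw e2image dump. Do this BEFORE any e2fsck."""
    with open(image_path, "wb") as out:
        e2image = subprocess.Popen(["e2image", "-r", device, "-"],
                                   stdout=subprocess.PIPE)
        bzip2 = subprocess.Popen(["bzip2"], stdin=e2image.stdout, stdout=out)
        e2image.stdout.close()  # let e2image see a broken pipe if bzip2 dies
        bzip2_status = bzip2.wait()
        e2image_status = e2image.wait()
    if e2image_status != 0 or bzip2_status != 0:
        raise RuntimeError("could not create the e2image dump")


def forced_check(device, transcript_path):
    """Run one forced e2fsck pass, append its output to the transcript."""
    # -y is an assumption of this sketch: answer "yes" to every question.
    result = subprocess.run(["e2fsck", "-f", "-y", device],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            text=True)
    with open(transcript_path, "a") as log:
        log.write(result.stdout)
    return result.returncode


def collect_report(device, image_path, transcript_path):
    """Dump metadata, check twice, and say whether the second pass is clean."""
    dump_metadata(device, image_path)
    first = forced_check(device, transcript_path)
    second = forced_check(device, transcript_path)
    return first, second, second_run_found_errors(second)
```

Calling collect_report() on an unmounted filesystem (the device name and
output paths are yours to choose) leaves behind the image and the
transcript; if the third value it returns is True, those two files are
what would be worth sending along.
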
If we can't replicate the problem that way, then it's likely caused by
a hardware problem or a device driver problem, such that two reads of
a single block return different results, or a read, write, read
sequence on a block doesn't read back the same data which was written.
E2fsck fundamentally assumes that the device driver, disk controller,
and disk drive are sane, that data written stays written, and that
data read at one time stays the same until modified by an intervening
write. If these assumptions are violated, all guarantees are off.

> Hmm. Then when are these error messages about
> journal aborted
> or something similar from, when I booted 2.6.0-test5, while with test4
> it was working.

If the filesystem code detects a problem, which can be caused by a
filesystem inconsistency on disk, or a hardware error, or a device
driver problem of some sort, then the filesystem throws an error.
What happens at this point depends on how the filesystem is
configured. The filesystem can be told to ignore the error ("don't
worry, be happy") and just continue on. The filesystem can be set so
that if a filesystem inconsistency is detected, the system is forced
to panic and reboot. Finally, the filesystem can be set to go
read-only. In that case, journal writes are stopped (which is the
source of the "journal aborted" message), and the filesystem is
remounted read-only.

There are generally other messages before the "journal aborted"
message, which indicate what is really going on.

						- Ted

_______________________________________________
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users