Re: fsck failure at boot

On Apr 21, 2006, at 6:32 PM, Herta Van den Eynde wrote:

Well, the relevance of SANsurfer would depend on what the problem is. When groping in the dark, it'd be one of the places I'd look for indications. But upon re-reading your initial post, I agree that chances are slim that the HBA is the root cause of your problem.

You mentioned RHAS4. Are you using a standard Red Hat kernel, or did you build your own? (The reason I ask is that I want to exclude an initial ram disk that doesn't know about your QLogic HBA.)

This is a standard RHAS 2.6.* ia64 kernel.

I'm a bit confused by the "*** An error occurred during the file system check" error message you mentioned in your first mail. I expect that to be generated by /etc/rc.d/rc.sysinit, not by fsck.ext3. (Might be a cut-n-paste to the wrong portion of the mail body?) Note that there are two locations in that script that can generate that error: once while the root filesystem is mounted read-only, and again after lvm2 initialization.
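For anyone following along, a minimal sketch of how the two locations could be found. It assumes the message text in rc.sysinit matches the console output quoted above; `find_fsck_msg` and the `RC_SYSINIT` variable are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Sketch only: find_fsck_msg is a hypothetical helper, not part of rc.sysinit.
# It prints each matching line of the given script, prefixed by its line number.
find_fsck_msg() {
    grep -n 'An error occurred during the file system check' "$1"
}

# RC_SYSINIT is an assumed override; the default is the path named in this thread.
script="${RC_SYSINIT:-/etc/rc.d/rc.sysinit}"
if [ -r "$script" ]; then
    find_fsck_msg "$script" || echo "message not found in $script"
else
    echo "$script is not readable"
fi
```

The line numbers it prints show which surrounding block (the root filesystem check or the check after lvm2 initialization) each message belongs to.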

It was copy/pasted straight out of the console. It *is* possible that there is a paste error, as I grabbed sections at a time.

The complaint about the superblock problem can be ignored, insofar as the superblock must be correct, as is evident from the fact that you can mount the partition just fine when the system is fully booted. (Assuming that /dev/sdl1 doesn't exist, a "fsck.ext3 -a /dev/sdl1" will generate the same error.)

[root@altix ~]# fsck.ext3 -a /dev/sdl1
fsck.ext3: No such file or directory while trying to open /dev/sdl1
/dev/sdl1:
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

But combined with the error "fsck.ext3: No such file or directory while trying to open /dev/sdb1", it looks like the device special filename /dev/sdb1 hasn't been created yet at the time you're trying to use it.

Do dmesg or /var/log/messages contain additional information?

The dmesg is at . Here is a seemingly relevant section from /var/log/messages. I'm not sure why the timestamps are all jumbled, but assuming the entries are logged sequentially, udev certainly appears to load the device before fsck.ext3 is called. Note, this is for a boot with the /dev/sdb1 entry commented out in /etc/fstab.

Apr 20 19:32:24 altix kernel: SELinux:  Initializing.
Apr 20 19:32:24 altix kernel: SELinux:  Starting in permissive mode
Apr 20 19:32:24 altix kernel: There is already a security framework initialized, register_security failed.
Apr 20 15:32:10 altix start_udev: Starting udev:  succeeded
Apr 20 19:32:25 altix kernel: selinux_register_security: Registering secondary module capability
Apr 20 15:32:13 altix udevsend[704]: starting udevd daemon
Apr 20 19:32:25 altix kernel: Capability LSM initialized as secondary
Apr 20 15:32:14 altix scsi.agent[753]: disk at /devices/pci0000:02/0000:02:01.0/host4/target4:0:0/4:0:0:0
Apr 20 19:32:25 altix kernel: Mount-cache hash table entries: 1024 (order: 0, 16384 bytes)
Apr 20 15:32:14 altix scsi.agent[760]: disk at /devices/pci0000:02/0000:02:01.0/host4/target4:0:0/4:0:0:1
Apr 20 19:32:25 altix kernel: Boot processor id 0x0/0x0
Apr 20 15:32:16 altix rc.sysinit: -e
Apr 20 19:32:25 altix kernel: task migration cache decay timeout: 10 msecs.
Apr 20 19:32:25 altix rpcidmapd: rpc.idmapd startup succeeded
Apr 20 15:32:16 altix sysctl: net.ipv4.ip_forward = 0
Apr 20 15:32:16 altix sysctl: net.ipv4.conf.default.rp_filter = 1
Apr 20 19:32:25 altix netfs: Mounting other filesystems:  succeeded
Apr 20 15:32:16 altix sysctl: net.ipv4.conf.default.accept_source_route = 0
Apr 20 15:32:16 altix sysctl: kernel.sysrq = 0
Apr 20 19:32:26 altix autofs: automount startup succeeded
Apr 20 15:32:16 altix sysctl: kernel.core_uses_pid = 1
Apr 20 15:32:16 altix rc.sysinit: Configuring kernel parameters: succeeded
Apr 20 19:32:26 altix smartd[1386]: smartd version 5.33 [ia64-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Apr 20 19:32:16 altix date: Thu Apr 20 19:32:16 EDT 2006
Apr 20 19:32:16 altix rc.sysinit: Setting clock (localtime): Thu Apr 20 19:32:16 EDT 2006 succeeded
Apr 20 19:32:26 altix smartd[1386]: Home page is http://smartmontools.sourceforge.net/
Apr 20 19:32:27 altix smartd[1386]: Opened configuration file /etc/smartd.conf
Apr 20 19:32:26 altix kernel: Brought up 4 CPUs
Apr 20 19:32:16 altix rc.sysinit: Setting hostname altix.raba.com: succeeded
Apr 20 19:32:27 altix smartd[1386]: Configuration file /etc/smartd.conf parsed.
Apr 20 19:32:17 altix fsck: [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/VolGroup00/LogVol00
Apr 20 19:32:17 altix fsck: /dev/VolGroup00/LogVol00: clean, 293377/9781248 files, 2273903/19546112 blocks


Is this a system you can take down for testing?  If so, could you
- edit rc.sysinit to slightly change one of the two "*** An error occurred during the file system check" error messages, to determine which of the two locations actually causes the error?
- reboot again, and when you're dropped to the shell,
  - manually check whether the device special file
    /dev/sdb1 exists or not
  - manually execute the checks in rc.sysinit prior
    to the error message to determine which one fails
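
The manual checks above could be sketched roughly as follows from the maintenance shell. This is only an illustration: `check_device` and the `DEV` variable are hypothetical names, /dev/sdb1 is the device from this thread, and the read-only `-n` run stands in for whatever check rc.sysinit actually performs.

```shell
#!/bin/sh
# Sketch only: check_device is a hypothetical helper, not part of rc.sysinit.
# It reports whether the given path is a block device node.
check_device() {
    dev="$1"
    if [ -b "$dev" ]; then
        echo "$dev: block device node exists"
        return 0
    fi
    echo "$dev: no block device node"
    return 1
}

# DEV is an assumed override; the default is the device discussed in this thread.
dev="${DEV:-/dev/sdb1}"
if check_device "$dev"; then
    # -n opens the filesystem read-only and answers "no" to every question,
    # so this run does not modify the disk.
    status=0
    fsck.ext3 -n "$dev" || status=$?
    echo "fsck.ext3 exit status: $status"
else
    # Show which sd device nodes are present at this point in the boot.
    ls -l /dev/sd* 2>/dev/null || true
fi
```

If the node is missing at this point but present after a full boot, that would point at device node creation timing rather than at the filesystem itself.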

Yes, I should be able to attempt this on Monday. I will follow up with details.

Thanks,

--
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net



