Re: [Novalug] My weekend is RUINED!!! =(((

Megan Larko <larkoc@xxxxxxxx> · Fri, 05 Mar 2010 18:34:06 -0500

Alan Grimes wrote:

Okay.  Let's back up a little bit here.

Fixed memory problem on company server.  It rebooted just fine after the memory adjustment.
Then did you bring the company server down again to install the IDE hard drive using Cable Select?
Was this an additional boot after which the machine did not come up?  Is there the option of 
un-doing the most-recent drive installation?  The new disk you added was to be independent of the 
two HD in the software RAID controller?  Is it possible that the newly added disk got itself 
detected as a part of the previously existing RAID and then the RAID did whatever it was the RAID 
was supposed to do?   Example: the new disk was detected as a part of the RAID and then the RAID sw 
did either its 0 or 1 or 10 or 5 or whatever behavior for which it is set?
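
One way to check that guess, once the box is up (or from a live disc), is to read /proc/mdstat and see which partitions each md array claims.  A minimal sketch; the sample text and the device names in it are made up for illustration and are not taken from the server:

```python
import re

# Hypothetical sample in the format of /proc/mdstat; device names are made up.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 hdb1[1] hda1[0]
      104320 blocks [2/2] [UU]

md3 : active raid1 hdc1[2](S) hdb3[1] hda3[0]
      78043648 blocks [2/2] [UU]

unused devices: <none>
"""

def parse_mdstat(text):
    """Return {array_name: [member_device, ...]} from /proc/mdstat text."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\S*)\s*:\s*(.*)$", line)
        if not m:
            continue
        name, rest = m.groups()
        # Members look like "hda1[0]" or "hdc1[2](S)"; drop the bracket suffix.
        members = re.findall(r"(\S+?)\[\d+\](?:\(\w\))?", rest)
        arrays[name] = sorted(members)
    return arrays

def arrays_containing(arrays, disk):
    """List arrays that have any partition of `disk` (e.g. "hdc") as a member."""
    return sorted(name for name, members in arrays.items()
                  if any(m.startswith(disk) for m in members))
```

On a real system you would pass in open("/proc/mdstat").read() instead of the sample.  If arrays_containing() returns anything for the newly added disk, the RAID has picked it up.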

I would suggest first disconnecting the most-recently added disk and bringing the box back up.  I know 
udev is supposed to reduce how much the end user needs to know about the physical hard devices in 
the system.   I think that is a good thing generally.   I used to have to look at a text file I 
would put onto all of my external RAID towers under old Red Hat 7 and 8 configs because if a disk 
dropped off line because of excessive scsi errors then all of the other devices would move up one 
slot.   The fstab "LABEL" eliminated that specific issue.  udev is supposed to go a step further 
in my understanding in that it reads the UUID serial number from the disk and sets that string into 
the fstab file as the device to be mounted at a specified mount point.  I did personally make a 
"note" in my fstab file about which partition is which when I pulled a drive to retrieve data and I 
needed to know how to map the mount points.

Example from my current /etc/fstab file on Ubuntu 9.04:
# /dev/sda1
UUID=614e87dd-dd36-4026-9127-59ae0942f9f3 /               ext3    relatime,errors=remount-ro 0       1
# /dev/sda5
UUID=1067d6b1-dda9-4e9c-9f39-bd7a77357676 none            swap    sw              0       0

I duplicated the udev UUID line and created a comment line in fstab which shows how it mapped in my 
system.  I use the UUID line until the hard disk corrupts and I have to pull the drive, place it 
on a cable, mount what I can, and get back what I can.   I call my note cheap insurance.
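
The same note can be generated instead of typed by hand.  A rough sketch, assuming the /dev/disk/by-uuid symlinks that udev maintains are present; the function names are my own and not part of any tool:

```python
import os

def uuid_map(by_uuid_dir="/dev/disk/by-uuid"):
    """Map UUID -> device node by resolving the symlinks udev maintains."""
    mapping = {}
    for uuid in os.listdir(by_uuid_dir):
        mapping[uuid] = os.path.realpath(os.path.join(by_uuid_dir, uuid))
    return mapping

def annotate_fstab(fstab_text, mapping):
    """Insert a '# /dev/...' comment above each UUID= line that lacks one."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("UUID="):
            dev = mapping.get(fields[0][len("UUID="):])
            note = "# " + dev if dev else None
            # Skip the note if the previous line already carries it.
            if note and (not out or out[-1].strip() != note):
                out.append(note)
        out.append(line)
    return "\n".join(out) + "\n"
```

Something like print(annotate_fstab(open("/etc/fstab").read(), uuid_map())) shows the annotated text without touching the file.  I would review the output by hand before copying anything back into /etc/fstab.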

Anyway, can you disconnect the most-recently installed drive and then reboot as it was before?  If 
not, can you boot from a live DVD or live USB and check out the disks from there to see if any 
setting has changed?
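
From the live environment, one quick check is whether the device nodes named in the server's fstab exist at all.  A small sketch; the /mnt/server mount point in the usage note is an assumption, and the helper name is made up:

```python
import os

def missing_fstab_devices(fstab_text, exists=os.path.exists):
    """Return the /dev/... entries in fstab text whose device node is absent."""
    missing = []
    for line in fstab_text.splitlines():
        # Ignore comments, then look at the first field (the device).
        fields = line.split("#", 1)[0].split()
        if fields and fields[0].startswith("/dev/") and not exists(fields[0]):
            missing.append(fields[0])
    return missing
```

With the server's root mounted at, say, /mnt/server, missing_fstab_devices(open("/mnt/server/etc/fstab").read()) lists the entries (md1, md2, md3 and so on) that have no node under the running /dev.  UUID= and LABEL= lines are skipped, since they do not name a node directly.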

I agree just adding a disk should not confuse a box.

megan

You know, it actually occurred to me the other day to publish a list of
software that I actually liked, instead of ranting about linux again. I
was starting to draft that message in my head, might even have written
it... Then I was planning to spend some quality time with my project car
and play a game called Star Ocean that I picked up the other day..
(That's as good as weekends get in my neck of the woods). But then this
morning happened.

Yesterday evening the server at my company went down. (problem with the
contacts in the memory sockets, the machine seems to date from the
2002-2003 era.) I took the opportunity to rip out the ancient SCSI CD-RW
drive and put in a modern IDE drive.

Because I had spent a great deal of quality time with the machine last
time it went down, it booted just fine after the memory issue was resolved.

So I go into work this morning, and log into it to see how the new drive
was doing. It was connected to the second controller, in CS mode on a
standard 80-conductor cable even though the kernel didn't detect the cable
correctly, possibly some other glitch...

I would have looked a little further into this part, personally.

There are two HDs in software RAID on the first controller.

So I go into /dev expecting to find hdc and, possibly a symlink to CDROM...

(My local machine -- gentoo -- is currently calling my only optical
drive "cdrom3" and "dvd3" -- who knows?)

Well, on the server there was bupkis,

no hda, no hdb, no hdc, no md1, no md2, no md3 (even though that's what
was in fstab!!!)

I know that udev is supposed to create those so I started googling.

Google doesn't seem to know anything about this. The udev website hasn't
been updated in 5 years (even though the package itself is rewritten
every 20 minutes...)

From my local config.
#############################
Emerging (1 of 1) sys-fs/udev-151-r1
 * udev-151.tar.bz2 RMD160 SHA1 SHA256 size ;-) ...

                     [ ok ]
 * checking ebuild checksums ;-) ...

                     [ ok ]
 * checking auxfile checksums ;-) ...

                     [ ok ]
 * checking miscfile checksums ;-) ...

                     [ ok ]
 * CPV:  sys-fs/udev-151-r1
 * REPO: gentoo
 * USE:  devfs-compat elibc_glibc extras kernel_linux old-hd-rules userland_GNU x86
 * Determining the location of the kernel source code
 * Found kernel source directory:
 *     /usr/src/linux
 * Found kernel object directory:
 *     /lib/modules/2.6.33/build
 * Found sources for kernel version:
 *     2.6.33
 * Checking for suitable kernel configuration options...
 *   CONFIG_IDE:         should not be set. But it is.
 * Please check to make sure these options are set correctly.
 * Failure to do so may cause unexpected problems.
 *
 * udev-151 does not support Linux kernel before version 2.6.25!
 * For a reliable udev, use at least kernel 2.6.27

 * Your kernel version (2.6.33) is new enough to run udev-151 reliably.
##############################

I have not changed my IDE settings in many many years *BECAUSE MY
COMPUTER BASICALLY WORKS*. I have two drives on my first IDE controller
(home PC). What the hell is going on here??? I'm still seeing my full
drive complement but only because my machine has been running for many
days so it is basically running on inertia. -- IT PROBABLY WON'T WORK IF
I REBOOT IT.

The server is currently so foobar that it can't even mount its boot
partition to install a new kernel.

Its UDEV barf is:

#####################
Emerging (1 of 1) sys-fs/udev-151-r1
 * udev-151.tar.bz2 RMD160 SHA1 SHA256 size ;-) ...

                     [ ok ]
 * checking ebuild checksums ;-) ...

                     [ ok ]
 * checking auxfile checksums ;-) ...

                     [ ok ]
 * checking miscfile checksums ;-) ...

                     [ ok ]
 * CPV:  sys-fs/udev-151-r1
 * REPO: gentoo
 * USE:  devfs-compat elibc_glibc kernel_linux old-hd-rules userland_GNU x86
 * Determining the location of the kernel source code
 * Found kernel source directory:
 *     /usr/src/linux
 * Found sources for kernel version:
 *     2.6.30.5
 * Checking for suitable kernel configuration options...
 *   CONFIG_SYSFS_DEPRECATED:    should not be set. But it is.
 *   CONFIG_SYSFS_DEPRECATED_V2:         should not be set. But it is.
 *   CONFIG_IDE:         should not be set. But it is.
 * Please check to make sure these options are set correctly.
 * Failure to do so may cause unexpected problems.
 *
 * udev-151 does not support Linux kernel before version 2.6.25!
 * For a reliable udev, use at least kernel 2.6.27

 * Your kernel version (2.6.30.5) is new enough to run udev-151 reliably.

### AND ###

 * Messages for package sys-fs/udev-151-r1:

 *   CONFIG_IDE:         should not be set. But it is.
 * Please check to make sure these options are set correctly.
 * Failure to do so may cause unexpected problems.
 *
 * udev-151 does not support Linux kernel before version 2.6.25!
 * For a reliable udev, use at least kernel 2.6.27
 *
 * Updating persistent-net rules file
 *
 * restarting udevd now.
 *
 * If after the udev update removable devices or CD/DVD drives
 * stop working, try re-emerging HAL before filling a bug report
 *
 * persistent-net does assigning fixed names to network devices.
 * If you have problems with the persistent-net rules,
 * just delete the rules file
 *      rm /etc/udev/rules.d/70-persistent-net.rules
 * and then reboot.
 *
 * This may however number your devices in a different way than they are now.
 *
 * If you build an initramfs including udev, then please
 * make sure that the /sbin/udevadm binary gets included,
 * and your scripts changed to use it,as it replaces the
 * old helper apps udevinfo, udevtrigger, ...
 *
 * mount options for directory /dev are no longer
 * set in /etc/udev/udev.conf, but in /etc/fstab
 * as for other directories.
 *
 * devfs-compat use flag is enabled (by default).
 * This enables devfs compatible device names.
 * If you use /dev/md/*, /dev/loop/* or /dev/rd/*,
 * then please migrate over to using the device names
 * /dev/md*, /dev/loop* and /dev/ram*.
 * The devfs-compat rules will be removed in the future.
 * For reference see Bug #269359.
 *
 * old-hd-rules use flag is enabled (by default).
 * This adds the removed rules for /dev/hd* devices
 * Please migrate to the new libata.
 * These rules will be removed in the future
 *
 * For more information on udev on Gentoo, writing udev rules, and
 *          fixing known issues visit:
 *          http://www.gentoo.org/doc/en/udev-guide.xml
Auto-cleaning packages...

###################

What the hell is that bit about HAL? Painful experience has shown that
xorg server is incompatible with HAL. It is impossible to achieve
acceptable reliability of my xorg server with HAL installed on the
system, so I don't.

When it comes to the keyboard, mouse, and disk drives, no conceivable
failure mode is tolerable. By direct implication, I might have to
replace the operating systems on both my home PC and my company's server
with PC BSD or something if linux cannot be patched in such a way that
it is simply not possible for it to fail to configure its primary hard
drives. =(

That part at the end: how can you possibly remove /dev/hda??? I mean
Minix had a pretty good system, the HD was (hd 0,1) etc... And that was
a good scheme. I mean I have not read news about this anywhere. It
didn't make Slashdot, it's not on the udev website because it has never
been updated. Google has no answers, only other people's questions. Did
someone bother to write even a whitepaper before obliterating 15 years
of my linux experience? I mean I'm open to change, if you have a ***
BETTER *** system, then great! Make sure I know about it before you
break the old system.

Will my main desktop even boot if I power cycle it? On the server I
can't use any of the utilities because all the drives are invisible and
inaccessible. I'm completely helpless at this point. The system is still
running. Some of the volumes are mounted and serving pages but I can't
administer anything.

I mean I'm not even going to be able to get any sleep until I can have a
bronze cast guarantee that, YES, Linux will allow me to access my
drives... With DOS, it was never ever a question, it would always work.
If I go boot the machine upstairs, if the drive spins at all, DOS will
be fully functional. Shoot me dead if it fails.

This is a 9-alarm fire, I need help from anyone to re-gain control of
both the server and my home-computer and restore sanity to the linux
developers. The disk drivers CANNOT FAIL, they're the lynchpin of the
entire computer.

Come to think of it, this is probably in the top-5 list of the most
spectacular operating system failures I've seen my entire life. This is
so insane it's surreal.
