On Wed, 2007-12-26 at 09:53 +0530, Chirag Jog wrote:
> * sven@xxxxxxxxxxxxxxxxxxxxx <sven@xxxxxxxxxxxxxxxxxxxxx> [2007-12-24 13:00:30]:

Hi,

Sorry for top-posting the last email. I have not learned how to inline
on the Blackberry. Comments below.

> > Do you have DEBUG and LOCKDEP configured?
> CONFIG_PREEMPT_DEBUG is enabled
> but LOCKDEP is not.
> >
> > -----Original Message-----
> > From: Sripathi Kodi <sripathik@xxxxxxxxxx>
> >
> > We are trying to get a ppc64 box booted with the -rt kernel.
> >
> > Tried the latest 2.6.24-rc5-rt1 kernel.
> >
> > Everything goes well [ kernel boots up etc] until the services
> > start coming up.
> > It takes a lot of time at Starting udev.
> > After which either services take too long to start or we get
> > SEGFAULT.

I have seen this exact behavior with lockdep enabled on earlier x86 SMP
kernels. I saw the issue predominantly on larger SMP machines, and it
behaved much like a race. This was observed on 2.6.21 kernels; I will
try to verify whether .23 shows the same behavior. We eventually
bisected it down to CONFIG_PROVE_LOCKING, but I wasn't totally convinced
whether the root cause was in lockdep or elsewhere.

> >
> > This is easily reproducible. Also if we get a chance to login;
> > simple commands like ls, vi etc take either too long or get
> > SEGFAULT.
> >
> > Booting the kernel with maxcpus=1 or turning off SMP, doesn't
> > solve the problem.
> >

Hmmm, so have you COMPILED-out SMP, or just booted with nosmp? Compiling
a UP kernel might help narrow this down.
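If it helps, here is a rough sketch of a check for which of these options a given .config actually sets. It is Python rather than anything from the tree; the function name and the sample config text are made up for illustration, and only the option names come from this thread.

```python
import re

# Option names taken from this thread; extend the tuple as needed.
OPTIONS = ("CONFIG_SMP", "CONFIG_PROVE_LOCKING", "CONFIG_LOCKDEP")


def option_states(config_text, options=OPTIONS):
    """Map each option to its value ('y', 'm', ...) or 'n' if unset or absent.

    Hypothetical helper: it only understands the usual 'CONFIG_FOO=value'
    lines, and treats '# CONFIG_FOO is not set' the same as a missing line.
    """
    states = {name: "n" for name in options}
    for line in config_text.splitlines():
        match = re.match(r"(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in states:
            states[match.group(1)] = match.group(2)
    return states


# Made-up sample, NOT the config attached to the original report.
SAMPLE = """\
CONFIG_SMP=y
# CONFIG_PROVE_LOCKING is not set
CONFIG_PREEMPT_RT=y
"""

if __name__ == "__main__":
    for name, value in option_states(SAMPLE).items():
        print(f"{name}={value}")
```

Feeding it the text of the real .config instead of SAMPLE would show at a glance whether SMP was compiled out or only disabled at boot.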
Thanks
Sven

> >
> > Here is logs from one of the booting sequences:
> >
> > Linux version 2.6.24-rc5-rt1 (root@llm29) (gcc version 4.1.0 (SUSE Linux)) #1
> > SMP PREEMPT RT Thu Dec 20 13:41:45 IST 2007 [boot]0012 Setup Arch
> > PPC64 nvram contains 16384 bytes
> > Zone PFN ranges:
> > DMA 0 -> 1024000
> > Normal 1024000 -> 1024000
> > Movable zone start PFN for each node
> > early_node_map[1] active PFN ranges
> > 0: 0 -> 1024000
> > [boot]0015 Setup Done
> > Real-Time Preemption Support (C) 2004-2007 Ingo Molnar
> > Built 1 zonelists in Node order, mobility grouping on. Total pages: 1000000
> > Policy zone: DMA
> > Kernel command line: root=/dev/hda2 quiet sysrq=1 askmethod
> > Starting udevd
> > Creating devices
> > Waiting for device /dev/hda2 to appear: ok
> > rootfs: major=3 minor=2 devn=770
> > fsck 1.38 (30-Jun-2005)
> > [/bin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/hda2
> > /dev/hda2: clean, 249426/1311552 files, 1860112/2622611 blocks
> > fsck succeeded. Mounting root device read-write.
> > Mounting root /dev/hda2
> > INIT: version 2.86 booting
> > System Boot Control: Running /etc/init.d/boot
> > Mounting procfs at /proc done
> > Mounting sysfs at /sys done
> > Mounting debugfs at /sys/kernel/debug done
> > Initializing /dev done
> > Mounting devpts at /dev/pts done
> > Activating swap-devices in /etc/fstab... done
> > showconsole: Warning: the ioctl TIOCGDEV is not known by the kernel
> > Starting udevd udevd-event[2463]: run_program: '/bin/sh' abnormal exit
> > udevd-event[2474]: run_program: '/sbin/vol_id' abnormal exit
> > udevd-event[2477]: run_program: '/sbin/vol_id' abnormal exit
> > udevd-event[2455]: run_program: '/sbin/ifup' abnormal exit
> > udevd-event[2477]: run_program: '/lib/udev/mount.sh' abnormal exit
> >
> > failed
> > Loading required kernel modules done
> > Activating device mapper...
> > FATAL: Module dm_mod not found.
> > failed
> > Starting MD Raid done
> > Waiting for udev to settle...
> > Scanning for LVM volume groups...
> > Reading all physical volumes. This may take a while...
> > No volume groups found
> > No volume groups found
> > Activating LVM volume groups...
> > No volume groups found
> > done
> > showconsole: Warning: the ioctl TIOCGDEV is not known by the kernel
> > Checking file systems...
> > fsck 1.38 (30-Jun-2005)
> > Checking all file systems.
> > [/sbin/fsck.ext3 (1) -- /home] fsck.ext3 -a /dev/hda5
> > /dev/hda5: clean, 460612/1836928 files, 2774880/3670844 blocks
> > [/sbin/fsck.ext3 (1) -- /home/mohan/mnt] fsck.ext3 -a /dev/hda3
> > /dev/hda3: clean, 362719/1311552 files, 1651440/2622611 blocks
> > *** glibc detected *** /bin/sh: free(): invalid pointer: 0x100a5170 ***
> > ======= Backtrace: =========
> > /lib/power4/libc.so.6[0xfde4be4]
> > /lib/p/lib/apparmor/rc.apparmor.functions: line 87: 2603 Segmentation fault
> > grep -q securityfs /proc/filesystems
> > /lib/apparmor/rc.apparmor.functions: line 355: 2606 Aborted
> > grep -qE "^(subdomain|apparmor)[[:space:]]" /proc/modules
> > /lib/apparmor/rc.apparmor.functions: line 318: 2610 Segmentation fault
> > modinfo -F filename subdomain >/dev/null 2>&1
> > /lib/apparmor/rc.apparmor.functions: line 318: 2613 Segmentation fault
> > grep -qE "^(subdomain|apparmor)[[:space:]]" /proc/modules FATAL: Module
> > apparmor not found.
> > Loading AppArmor module failed
> > - could not start AppArmorfailed
> > /etc/init.d/boot.cleanup: line 23: 2601 Segmentation fault rm -f
> > /var/lib/rpm/__db* /etc/init.d/boot.cleanup: line 23: 2602 Segmentation
> > fault rm -rf /tmp/screens /tmp/uscreens /var/run/screens
> > /var/run/uscreens 2>/dev/null /etc/init.d/boot.cleanup: line 23: 2605
> > Segmentation fault rm -f /tmp/.X*lock /var/spool/uucp/LCK*
> > /var/log/sa/sadc.LOCK /fsck_corrected_errors 2>/dev/null
> > /etc/init.d/boot.cleanup: line 23: 2608 Segmentation fault find
> > /tmp/ssh-* -type s -name "*agent*" -maxdepth 1 -print0 2>/dev/null 2609
> > | xargs -0 -r rm -f
> > /etc/init.d/boot.cleanup: line 23: 2611 Segmentation fault find
> > /var/run /var/lock -type f -print0 2612 | xargs -0 -r
> > rm -f 2>/dev/null
> > /etc/init.d/boot.cleanup: line 23: 2615 Segmentation fault chmod 664
> > /var/run/utmp /etc/init.d/boot.cleanup: line 23: 2616 Segmentation fault
> > chown root:tty /var/run/utmp /etc/init.d/boot.cleanup: line 23: 2617
> > Segmentation fault ls /etc/resolv.conf.saved.by.* >&/dev/null
> > /etc/init.d/boot.cleanup: line 75: 2618 Segmentation fault chown
> > root:root $CURDIR /etc/init.d/boot^[[6~.cleanup: line 75: 2619 Segmentation
> > fault chown root:root $CURDIR /etc/init.d/boot.cleanup: line 75: 2620
> > Segmentation fault chown root:root $CURDIR /etc/init.d/boot.cleanup:
> > line 75: 2621 Segmentation fault chown root:root $CURDIR
> > /etc/init.d/boot.cleanup: line 75: 2622 Segmentation fault chown
> > root:root $CURDIR /etc/init.d/boot.cleanup: line 75: 2623 Segmentation
> > fault c^[[6~hown root:root $CURDIR /etc/init.d/boot.cleanup: line 82:
> > 2624 Segmentation fault chown root:root $CURDIR
> > /etc/init.d/boot.cleanup: line 23: 2625 Segmentation fault rm -f
> > /etc/psdevtab /etc/init.d/boot.cleanup: line 23: 2626 Segmentation fault
> > /bin/ps >/dev/null 2>/dev/null No such file or directory
> > No such file or directory
> > No such file or directory
> > No such file or directory
> > No such file or directory
> > No such file or directory
> > System Boot Control: The system has been set up
> > Failed features: boot.udev boot.localfs boot.clock boot.klog boot.scpm
> > boot.crypto boot.localnet boot.swap boot.udev_retry boot.ldconfig
> > boot.sysctl boot.ipconfig System Boot Control: Running
> > /etc/init.d/boot.local done INIT: Entering runlevel: 3
> > EXT3-fs error (device hda2): ext3_add_entry: bad entry in directory #971592:
> > rec_len % 4 != 0 - offset=0, inode=9349464, rec_len=54811, name_len=129
> > blogd: can not open /var/run/blogd.pid: Input/output error
> > Boot logging started on /dev/hvc0(/dev/console) at Thu Dec 20 14:19:34 2007
> > Master Resource Control: previous runlevel: N, switching to runlevel: 3
> > Master Resource Control: runlevel 3 has been reached
> > /etc/init.d/rc: line 383: 2669 Illegal instruction killproc -QUIT
> > /sbin/blogd Segmentation fault
> > /bin/bash: error while loading shared libraries: �C�x��H: cannot
> > open shared object file: No such file or directory INIT: Id "cons"
> > respawning too fast: disabled for 5 minutes
> >
> > CPU Info:
> > # cat /proc/cpuinfo
> > processor : 0
> > cpu : PPC970, altivec supported
> > clock : 1600.000000MHz
> > revision : 2.2 (pvr 0039 0202)
> >
> > processor : 1
> > cpu : PPC970, altivec supported
> > clock : 1600.000000MHz
> > revision : 2.2 (pvr 0039 0202)
> >
> > timebase : 199836440
> > platform : pSeries
> > machine : CHRP IBM,8842-21X
> >
> > Also Attached is the config file:
> >
> > --
> > Cheers,
> > Chirag Jog
> >
> >
> > --
> > Cheers,
> > Chirag Jog
> >
> > -------------------------------------------------------
> >

-
To unsubscribe from this list: send the line "unsubscribe linux-rt-users"
in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html