System runs the latest FC5 x86-64 kernel (2.6.17-1.2139_FC5).
The system might suddenly hang hard or reboot. It seems to blurt sporadic
errors to syslog.

Console messages:

Message from syslogd@m1 at Thu Jun 29 00:56:55 2006 ...
m1 kernel: Oops: 0000 [1] SMP

Message from syslogd@m1 at Thu Jun 29 00:56:55 2006 ...
m1 kernel: CR2: 0000000000000020

Dmesg shows this oops:

Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP:
<ffffffff8022021c>{copy_process+3132}
PGD 56690067 PUD 54fbe067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /block/hdd/removable
CPU 0
Modules linked in: ipv6 autofs4 dm_mirror dm_mod video button battery acpi_memhotplug ac lp parport_pc parport sg i2c_nforce2 forcedeth floppy i2c_core raid1 ext3 jbd sata_nv libata sd_mod scsi_mod
Pid: 27939, comm: get-errors.sh Not tainted 2.6.17-1.2139_FC5 #1
RIP: 0010:[<ffffffff8022021c>] <ffffffff8022021c>{copy_process+3132}
RSP: 0018:ffff810056239d78  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff810056bfa700 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff81005642c648 RDI: ffff81007ac52420
RBP: ffff81007ac52420 R08: ffff81005593a000 R09: 00000000000559e1
R10: 0000000000000000 R11: 0000000000000001 R12: ffff81007dccf0c0
R13: ffff810037e6e400 R14: ffff81005642c590 R15: ffff81007b5be080
FS:  00002aaaaaab2d50(0000) GS:ffffffff8069c000(0000) knlGS:00000000f7e45b60
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000058a1e000 CR4: 00000000000006e0
Process get-errors.sh (pid: 27939, threadinfo ffff810056238000, task ffff8100639980c0)
Stack: 00000000ffffffff 00002aaaaaab2de0 0000000000000000 ffff810056239f58
       00007fffbe0c1fd0 0000000001200011 0000000000000000 ffff81007dccf0c0
       ffff810037e6e400 ffff81007ac522c8
Call Trace: <ffffffff8024ade2>{sprintf+81} <ffffffff8026a183>{_spin_unlock_irq+9}
       <ffffffff802331d7>{do_fork+208} <ffffffff802697db>{__mutex_lock_slowpath+868}
       <ffffffff80254017>{do_pipe+610} <ffffffff80269469>{__mutex_unlock_slowpath+522}
       <ffffffff8021454a>{generic_file_llseek+127} <ffffffff80262d8e>{system_call+126}
       <ffffffff8026309b>{ptregscall_common+103}

Code: 48 8b 40 20 f0 ff 43 28 f6 45 29 08 74 07 f0 ff 88 34 03 00
RIP <ffffffff8022021c>{copy_process+3132} RSP <ffff810056239d78>
CR2: 0000000000000020

get-errors.sh runs smartctl on both existing and nonexistent disks. It
seems to have hung on hdd.

hdd: CDU5211, ATAPI CD/DVD-ROM drive
hdd: ATAPI 52X CD-ROM drive, 120kB Cache, UDMA(33)

Relevant part of get-errors.sh:
-----
for device in "hda" "hdb" "hdc" "hdd" "hde" "hdf"
do
  exists=0
  smartctl -i /dev/$device | grep -i support | grep -ci enabled > $tempfile1
  exists=`cat $tempfile1`
  # Continue only if SMART support is enabled
  if [ "$exists" != "0" ]; then
    smartctl -H /dev/$device > $tempfile2
    smartctl -c /dev/$device >> $tempfile2
    errtmp=`grep -ci PASSED $tempfile2`
    # Log a failure if the health check output does not contain PASSED
    if [ "$errtmp" == "0" ]; then
      echo "SMART: /dev/$device failed smart tests" >> $logfile
      smarterr=$smarterr+1
    fi
  fi
done
-----

The system has rebooted twice in 3 days. This time it did not reboot, but
some processes have hung. Running ps ax was a bad idea, as that hung too.

The crash utility just fails with the following error:
crash: cannot resolve "cpu_pda"

For more info just ask. I might reboot it, but I'm sure it's not going to
be far between crashes.

-HK

-- 
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-devel-list
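
For anyone trying to reproduce this with less noise, here is a minimal sketch of a more defensive version of the loop from get-errors.sh. It is not the script that was running when the oops occurred, and it is not a fix for the kernel problem. It only skips device names that have no block device node and counts failures with shell arithmetic instead of string concatenation. The function names smart_enabled, health_passed and check_devices are hypothetical helpers made up for this example; the only smartctl options used are -i and -H, as in the original script.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original get-errors.sh.

# Succeeds if the smartctl -i style text on stdin reports SMART as enabled.
smart_enabled() {
    grep -i 'support' | grep -qi 'enabled'
}

# Succeeds if the smartctl -H style text on stdin contains PASSED.
health_passed() {
    grep -qi 'PASSED'
}

# Check each named device (e.g. hda hdb) and print a failure count.
check_devices() {
    smarterr=0
    for device in "$@"; do
        # Skip names that have no block device node at all.
        [ -b "/dev/$device" ] || continue
        # Skip devices that do not report SMART support as enabled.
        smartctl -i "/dev/$device" | smart_enabled || continue
        if ! smartctl -H "/dev/$device" | health_passed; then
            echo "SMART: /dev/$device failed smart tests"
            smarterr=$((smarterr + 1))
        fi
    done
    echo "failures: $smarterr"
}
```

It could be called as `check_devices hda hdb hdc hdd hde hdf`. Whether skipping missing device nodes changes the hang on hdd is untested, since hdd does exist here as an ATAPI CD-ROM drive.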