Hi,
After upgrading his kernel, Edward (on CC) is having trouble booting.
The kernel hangs after reporting it can't mount root on /dev/sda3, which is
supposed to be a partition on a SATA disk, connected to a sata_nv controller.
As a serial console is not available, we stripped down the kernel in the hope that
the SATA disk detection would appear on the same screen so that it could be
caught on camera :)
After removing the more verbose parts of the kernel (USB, ACPI, etc.) in an
attempt to get the disk detection messages onto the same screen, we ran into another
issue. The kernel oopses on boot and tries to kill init, so it's not even
getting as far this time (last time, it got all the way to trying to mount root).
This problem did not exist in 2.6.10, which can still be booted right now. It
is reproducible on both 2.6.12 and 2.6.13-rc3.
Under 2.6.10, these messages appear during disk detection:
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xF200 irq 23
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xF208 irq 23
ata1: dev 0 cfg 49:2f00 82:7c6b 83:7b09 84:4003 85:7c69 86:3a01 87:4003 88:407f
ata1: dev 0 ATA, max UDMA/133, 240121728 sectors:
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device removed
ata1: dev 0 configured for UDMA/133
scsi0 : sata_nv
ata2: no device found (phy stat 00000000)
scsi1 : sata_nv
Vendor: ATA Model: Maxtor 6Y120M0 Rev: YAR5
Type: Direct-Access ANSI SCSI revision: 05
st: Version 20041025, fixed bufsize 32768, s/g segs 256
SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)
SCSI device sda: drive cache: write back
/dev/scsi/host0/bus0/target0/lun0: p1 p2 p3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0
Under a minimal 2.6.13-rc3, this happens instead:
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xF200 irq 0
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xF208 irq 0
Unable to handle kernel NULL pointer dereference at <...> RIP:
sysfs_hash_and_remove+16
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid 1, comm: swapper Not tainted 2.6.13-rc3
RIP: sysfs_hash_and_remove+16
Call trace:
class_device_del class_device_unregister
scsi_remove_host setup_irq
request_irq ata_host_remove
ata_device_add pci_conf1_write
pcibios_set_master nv_init_one
pci_device_probe driver_probe_device
__driver_attach __driver_attach
__driver_attach bus_for_each_dev
bus_add_driver pci_register_driver
init child_rip
init child_rip
I can provide a full jpeg if required.
It looks like dir->d_inode is NULL, although I don't have much idea where the
real bug lies.
(gdb) list *sysfs_hash_and_remove+16
0x5b0 is in sysfs_hash_and_remove (semaphore.h:107).
102 * This is ugly, but we want the default case to fall through.
103 * "__down_failed" is a special asm handler that calls the C
104 * routine that actually waits. See arch/x86_64/kernel/semaphore.c
105 */
106 static inline void down(struct semaphore * sem)
107 {
108 might_sleep();
109
110 __asm__ __volatile__(
111 "# atomic down operation\n\t"
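To illustrate the suspicion above, here is a minimal userspace sketch of the
failure mode and one possible guard. It is not the kernel source: the struct
layouts are simplified stand-ins, down()/up() are trivial counters rather than
real semaphores, and hash_and_remove_guarded() and self_check() are
hypothetical names made up for this example. It only shows that taking
&dir->d_inode->i_sem faults when d_inode is NULL, and that checking first
avoids the fault. Whether such a check is the right fix, or whether the caller
should never get this far, I can't say.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; NOT the real layouts. */
struct semaphore { int count; };
struct inode { struct semaphore i_sem; };
struct dentry { struct inode *d_inode; };

/* Userspace stand-ins for down()/up(): they only adjust a counter. */
void down(struct semaphore *sem) { sem->count--; }
void up(struct semaphore *sem) { sem->count++; }

/*
 * Hypothetical guarded variant of the removal path.
 * Returns -1 without touching the semaphore when there is no inode,
 * and 0 after taking and releasing the semaphore otherwise.
 */
int hash_and_remove_guarded(struct dentry *dir)
{
	if (dir == NULL || dir->d_inode == NULL)
		return -1;	/* nothing to lock, nothing to remove */

	down(&dir->d_inode->i_sem);
	/* ... lookup and removal of the named entry would go here ... */
	up(&dir->d_inode->i_sem);
	return 0;
}

/* Exercises both cases: a dentry with an inode and one without. */
void self_check(void)
{
	struct inode ino = { { 1 } };
	struct dentry with_inode = { &ino };
	struct dentry without_inode = { NULL };

	assert(hash_and_remove_guarded(&with_inode) == 0);
	assert(ino.i_sem.count == 1);	/* semaphore released again */
	assert(hash_and_remove_guarded(&without_inode) == -1);
}
```

Without the NULL check, the second call in self_check() would dereference a
NULL d_inode, which is the kind of fault the oops above appears to report.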
Any ideas or suggestions?
Thanks,
Daniel