Hello, Cliff. Cliff Wickman wrote: > I've run into a problem with the ATA SCSI disk driver when running in a > kdump dump-capture kernel. > > I'm running on 2-processor x86_64 box. It has 2 scsi disks, /dev/sda and > /dev/sdb > > My kernel is 2.6.22, and built to be a dump capturing kernel loaded by kexec. > When I boot this kernel by itself, it finds both sda and sdb. > > But when it is loaded by kexec and booted on a panic it only finds sda. > > Any ideas from those familiar with the ATA driver? > > > -Cliff Wickman > SGI > > > I put some printk's into it and get this: > > Standalone: > > [nv_adma_error_handler] > cpw: ata_host_register probe port 1 (error_handler:ffffffff81348625) > cpw: ata_host_register call ata_port_probe > cpw: ata_host_register call ata_port_schedule > cpw: ata_host_register call ata_port_wait_eh > cpw: ata_port_wait_eh entered > cpw: ata_port_wait_eh, preparing to wait > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > cpw: ata_dev_configure entered > cpw: ata_dev_configure testing class > cpw: ata_dev_configure class is ATA_DEV_ATA > ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133 > ata2.00: 390721968 sectors, multi 16: LBA48 > cpw: ata_dev_configure exiting > cpw: ata_dev_configure entered > cpw: ata_dev_configure testing class > cpw: ata_dev_configure class is ATA_DEV_ATA > cpw: ata_dev_configure exiting > cpw: ata_dev_set_mode printing: > ata2.00: configured for UDMA/133 > cpw: ata_port_wait_eh, finished wait > cpw: ata_port_wait_eh exiting > cpw: ata_host_register done with probe port 1 > > > When loaded with kexec and booted on a panic: > > cpw: ata_host_register probe port 1 (error_handler:ffffffff81348625) > cpw: ata_host_register call ata_port_probe > cpw: ata_host_register call ata_port_schedule > cpw: ata_host_register call ata_port_wait_eh > cpw: ata_port_wait_eh entered > cpw: ata_port_wait_eh, preparing to wait > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > cpw: ata_port_wait_eh, finished wait > cpw: ata_port_wait_eh exiting > cpw: ata_host_register done with probe port 1 Hmmm.. PHY is up but for some reason libata thought there was no device there. Can you try the attached patch and report the result? -- tejun
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 6001aae..4bc042d 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -732,6 +732,9 @@ ata_dev_try_classify(struct ata_port *ap, unsigned int device, u8 *r_err) if (r_err) *r_err = err; + ata_port_printk(ap, KERN_ERR, "XXX: try_classif dev%d e=%02x %02x:%02x:%02x\n", + device, err, tf.lbal, tf.lbam, tf.lbah); + /* see if device passed diags: if master then continue and warn later */ if (err == 0 && device == 0) /* diagnostic fail : do nothing _YET_ */ @@ -3006,8 +3009,10 @@ int ata_wait_ready(struct ata_port *ap, unsigned long deadline) if (!(status & ATA_BUSY)) return 0; - if (!ata_port_online(ap) && status == 0xff) + if (!ata_port_online(ap) && status == 0xff) { + ata_port_printk(ap, KERN_ERR, "XXX: wait_ready status=0xff\n"); return -ENODEV; + } if (time_after(now, deadline)) return -EBUSY; @@ -3037,8 +3042,10 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask, if (dev0) { rc = ata_wait_ready(ap, deadline); if (rc) { - if (rc != -ENODEV) + if (rc != -ENODEV) { + ata_port_printk(ap, KERN_ERR, "XXX: post_reset dev0 rc=%d\n", rc); return rc; + } ret = rc; } } @@ -3067,8 +3074,10 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask, rc = ata_wait_ready(ap, deadline); if (rc) { - if (rc != -ENODEV) + if (rc != -ENODEV) { + printk("XXX: post_reset dev1 rc=%d\n", rc); return rc; + } ret = rc; } } @@ -3113,8 +3122,11 @@ static int ata_bus_softreset(struct ata_port *ap, unsigned int devmask, * the bus shows 0xFF because the odd clown forgets the D7 * pulldown resistor. */ - if (ata_check_status(ap) == 0xFF) + if (ata_check_status(ap) == 0xFF) { + ata_port_printk(ap, KERN_ERR, "XXX: status=0xFF altstatus=0x%x\n", + ata_altstatus(ap)); return -ENODEV; + } return ata_bus_post_reset(ap, devmask, deadline); } @@ -3402,6 +3414,8 @@ int ata_std_softreset(struct ata_port *ap, unsigned int *classes, if (slave_possible && ata_devchk(ap, 1)) devmask |= (1 << 1); + ata_port_printk(ap, KERN_ERR, "XXX: devmask=0x%x\n", devmask); + /* select device 0 again */ ap->ops->dev_select(ap, 0);