Hi James,

I recently started fiddling with the Emulex lpfc driver with the idea of
adding PCI error recovery support to the driver. I'm trying to figure out
how to proceed.

Some background: on IBM pSeries, and now on newer PCI-E based systems,
things like parity errors, etc. on the PCI bus are detected by the PCI
bridge chip, which then freezes all further traffic to the adapter. When
an error condition is detected, a handful of callbacks are made to the
device driver, which can then try to recover from the error and move
forward. When I/O is frozen, MMIO reads return all 0xffff's ...

I injected an error on the lpfc, and the (so far, completely unmodified)
driver promptly crashed on me:

0:mon> excp
cpu 0x0: Vector: 300 (Data Access) at [c0000003fbed3890]
    pc: d000000000aa23c0: .lpfc_dev_loss_tmo_callbk+0x68/0x238 [lpfc]
    lr: c0000000002e9dac: .fc_starget_delete+0x90/0x17c
    sp: c0000003fbed3b10
   msr: 9000000000009032
   dar: 6b6b6b6b6b6b7753
 dsisr: 40000000
  current = 0xc0000003fa4ac7f0
  paca    = 0xc000000000523300
    pid   = 4714, comm = fc_wq_1
0:mon> t
[c0000003fbed3bf0] c0000000002e9dac .fc_starget_delete+0x90/0x17c
[c0000003fbed3c80] c0000000002ebc5c .fc_rport_final_delete+0x80/0x124
[c0000003fbed3d20] c000000000067268 .run_workqueue+0xdc/0x168
[c0000003fbed3dc0] c000000000067d0c .worker_thread+0x140/0x1b0
[c0000003fbed3ee0] c00000000006c24c .kthread+0x124/0x174
[c0000003fbed3f90] c000000000024d20 .kernel_thread+0x4c/0x68

This is on 2.6.19-rc1-git11 -- I'll try to track this down further, but
thought I'd mention it now. Does such a crash look familiar?

--
Linas Vepstas