On Mon, Apr 23, 2007 at 01:20:32PM -0400, Salyzyn, Mark wrote:
> That is a failure to route the interrupts and is possibly an issue with
> the kernel and the hardware, and not the driver directly (since there is
> an expectation that request_irq will connect the interrupt to the
> interrupt service routine). Judith reported success in the past with
> this patch on her hardware, perhaps the motherboard on your system has
> some odd BIOS setup of the hardware that is giving acpi or the apic some
> headaches? Can you check out success or failure on other motherboards?
> Please try the suggestions from the driver (safe flags)?
>
> Sincerely -- Mark Salyzyn
>

Hi Mark,

We don't even go through the BIOS in kexec and kdump, so the BIOS should
not be an issue. It looks like you send some message to the controller and
then wait for an interrupt from the controller as an indication that the
command has completed. In this case you never seem to get that interrupt,
hence the timeout.

To bypass this problem, I am now booting my second kernel with the
"irqpoll" command line option. This makes sure that the aacraid interrupt
handler gets invoked even if there is an interrupt routing issue. This
option does help things progress, but it ends up corrupting something or
other on the disk. In three attempts I got three different types of errors.

In the first attempt I get a continuous stream of the following messages
once the root file system has been mounted.

=============================================
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
sda1: rw=0, want=9261304112, limit=41945652
attempt to access beyond end of device
============================================

In the second attempt, it mounted the file system but found some issue
with the "resize" inode and asked me to run fsck manually, which in turn
deleted a whole lot of inodes.

In the third attempt it panics later when it finds ext3 to be corrupted.

=========================================
Creating block device nodes.
Trying to resume from LABEL=SWAP-sda3
No suspend signature on swap, not resuming.
Creating root device.
Mounting root filesystem.
EXT3-fs: Magic mismatch, very weird !
mount: error mouKernel panic - not syncing: Attempted to kill init!
nting /dev/root
===================================================

Following are the relevant aacraid initialization messages on the serial
console.

===================================================================
Adaptec aacraid driver (1.1-5[2437]-mh4)
ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
AAC0: kernel 5.2-0[11835] Jan 9 2007
AAC0: monitor 5.2-0[11835]
AAC0: bios 5.2-0[11835]
AAC0: serial 1625d1
AAC0: 64bit support enabled.
AAC0: 64 Bit DAC enabled
scsi0 : ServeRAID
scsi 0:0:0:0: Direct-Access IBM x366 V1.0 PQ: 0 ANSI: 2
scsi 0:1:0:0: Direct-Access IBM-ESXS ST973401SS B519 PQ: 0 ANSI: 5
scsi 0:1:1:0: Direct-Access IBM-ESXS ST973401SS B519 PQ: 0 ANSI: 5
scsi 0:1:2:0: Direct-Access IBM-ESXS ST973401SS B519 PQ: 0 ANSI: 5
scsi 0:3:0:0: Enclosure IBM SAS SES-2 DEVICE 0.09 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 429459456 512-byte hardware sectors (219883 MB)
sd 0:0:0:0: [sda] Assuming Write Enabled
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 429459456 512-byte hardware sectors (219883 MB)
sd 0:0:0:0: [sda] Assuming Write Enabled
sd 0:0:0:0: [sda] Assuming drive cache: write through
sda: sda1 sda2 sda3 sda4 < sda5 >
sd 0:0:0:0: [sda] Attached SCSI removable disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
scsi 0:1:0:0: Attached scsi generic sg1 type 0
scsi 0:1:1:0: Attached scsi generic sg2 type 0
scsi 0:1:2:0: Attached scsi generic sg3 type 0
scsi 0:3:0:0: Attached scsi generic sg4 type 13
================================================

I am not sure why this reset leaves the file system in a corrupted state.
Is there a better way to handle this, like syncing the existing commands
before restarting it?

Should one keep a dedicated partition on the disk and not mount it in the
first kernel, and mount this partition only in the second kernel to save
the dump? I shall have to test such a configuration.

Thanks
Vivek
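
For reference, here is a rough sanity check of the numbers in the "attempt to access beyond end of device" messages from the first attempt. It assumes that want and limit are counts of 512-byte sectors, which is how the block layer usually reports them, but that unit is an assumption here and not something the log itself states. The helper names are made up for illustration.

```python
# Rough check of the "beyond end of device" numbers from the first log.
# ASSUMPTION: want/limit are in 512-byte sectors (not stated in the log).
SECTOR_BYTES = 512

WANT_SECTORS = 9261304112   # "want=" from the sda1 messages
LIMIT_SECTORS = 41945652    # "limit=" from the sda1 messages
DISK_SECTORS = 429459456    # whole-disk size from the "sd 0:0:0:0: [sda]" line


def sectors_to_bytes(sectors: int) -> int:
    """Convert a sector count to bytes under the 512-byte assumption."""
    return sectors * SECTOR_BYTES


def is_beyond_end(want: int, limit: int) -> bool:
    """True if the requested end sector lies past the device limit."""
    return want > limit


if __name__ == "__main__":
    print("sda1 size  :", sectors_to_bytes(LIMIT_SECTORS), "bytes")
    print("disk size  :", sectors_to_bytes(DISK_SECTORS), "bytes")
    print("requested  :", sectors_to_bytes(WANT_SECTORS), "bytes")
    print("past sda1  :", is_beyond_end(WANT_SECTORS, LIMIT_SECTORS))
    print("past disk  :", is_beyond_end(WANT_SECTORS, DISK_SECTORS))
```

Under that assumption sda1 is about 20 GiB, the whole disk works out to the 219883 MB printed by the sd driver, and the requested offset is about 4.3 TiB, so it lies past the end of the entire disk and not just past the end of sda1.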