Are these SCSI devices all on the same SCSI channel? If so, and if the SCSI card is dual channel, move the other devices (tape, CD-ROM, etc.) to the second channel and leave the hard disk on the main channel.

I've noticed that this kind of hang tends to happen when slow devices share a SCSI channel with the disk. A setup like this may appear to work fine for months and still cause problems from time to time.

If your SCSI card is not dual channel, get a second card and move the tape drive and CD-ROM to it. The system should then respond and perform much better.

Wolf

-----Original Message-----
From: David I. Bell [mailto:dibl@xxxxxxxxxxx]
Sent: Tuesday, 3 June 2003 5:45 PM
To: psyche-list@xxxxxxxxxx
Subject: SCSI Problem

Kind Readers,

Troubles recently started with my system. I'm running RH 8.0 (2.4.18-26.8.0smp) on a dual-CPU P-II/200MHz system. It's been up and running for 5 months or so. I just noticed that the system periodically hangs for several seconds (sometimes up to a minute or more) with the disk activity light solidly on. The system usually unblocks itself if I wait patiently. I've attached some text from /var/log/messages below. Does anyone know what the problem might be?

I have a SCSI disk at SCSI ID 0, a tape device at SCSI ID 3, and a CD-ROM at SCSI ID 5. The tape and the CD-ROM were not in use at the time of the error.

Do you think this might be a failing SCSI card, or is it a failing disk drive? Could it be an O/S induced hang -- deadlock of some kind related to SMP?

==============================================================
Jun 2 14:14:59 igor kernel: scsi0:0:0:0: Attempting to queue an ABORT message
Jun 2 14:14:59 igor kernel: scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x15f
Jun 2 14:14:59 igor kernel: ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xc0, ARG_2 = 0xf
Jun 2 14:14:59 igor kernel: HCNT = 0x0 SCBPTR = 0xf
Jun 2 14:14:59 igor kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
Jun 2 14:14:59 igor kernel: DFCNTRL = 0x4, DFSTATUS = 0x6d
Jun 2 14:14:59 igor kernel: LASTPHASE = 0xa0, SCSISIGI = 0xb6, SXFRCTL0 = 0x88
Jun 2 14:14:59 igor kernel: SSTAT0 = 0x7, SSTAT1 = 0x3
Jun 2 14:14:59 igor kernel: STACK == 0xe4, 0xe4, 0x159, 0x189
Jun 2 14:14:59 igor kernel: SCB count = 120
Jun 2 14:14:59 igor kernel: Kernel NEXTQSCB = 43
Jun 2 14:14:59 igor kernel: Card NEXTQSCB = 67
Jun 2 14:14:59 igor kernel: QINFIFO entries: 67 51
Jun 2 14:14:59 igor kernel: Waiting Queue entries:
Jun 2 14:14:59 igor kernel: Disconnected Queue entries:
Jun 2 14:14:59 igor kernel: QOUTFIFO entries:
Jun 2 14:14:59 igor kernel: Sequencer Free SCB List: 14 3 1 9 0 10 8 6 5 4 11 7 13 2 12
Jun 2 14:14:59 igor kernel: Sequencer SCB Info: 0(c 0x68, s 0x7, l 0, t 0xff) 1(c 0x68, s 0x7, l 0, t 0xff) 2(c 0x68, s 0x7, l 0, t 0xff) 3(c 0x68, s 0x7, l 0, t 0xff) 4(c 0x68, s 0x7, l 0, t 0xff) 5(c 0x68, s 0x7, l 0, t 0xff) 6(c 0x68, s 0x7, l 0, t 0xff) 7(c 0x68, s 0x7, l 0, t 0xff) 8(c 0x68, s 0x7, l 0, t 0xff) 9(c 0x68, s 0x7, l 0, t 0xff) 10(c 0x68, s 0x7, l 0, t 0xff) 11(c 0x68, s 0x7, l 0, t 0xff) 12(c 0x68, s 0x7, l 0, t 0xff) 13(c 0x68, s 0x7, l 0, t 0xff) 14(c 0x0, s 0x57, l 0, t 0xff) 15(c 0x0, s 0x57, l 0, t 0x3d)
Jun 2 14:14:59 igor kernel: Pending list: 51(c 0x68, s 0x7, l 0), 67(c 0x68, s 0x7, l 0), 61(c 0x0, s 0x57, l 0)
Jun 2 14:14:59 igor kernel: Kernel Free SCB list: 31 49 59 10 8 33 11 9 20 1 24 41 16 3 5 35 25 39 46 4 48 0 26 38 58 45 44 22 17 21 15 29 40 6 55 12 14 30 28 32 7 54 19 56 42 62 37 34 13 18 52 63 50 53 119 27 36 23 60 57 2 47 112 113 114 115 108 109 110 111 104 105 106 107 100 101 102 103 96 97 98 99 92 93 94 95 88 89 90 91 84 85 86 87 80 81 82 83 76 77 78 79 72 73 74 75 68 69 70 71 64 65 66 118 117 116
Jun 2 14:14:59 igor kernel: Untagged Q(5): 61
Jun 2 14:14:59 igor kernel: DevQ(0:0:0): 0 waiting
Jun 2 14:14:59 igor kernel: DevQ(0:3:0): 0 waiting
Jun 2 14:14:59 igor kernel: DevQ(0:5:0): 0 waiting
Jun 2 14:14:59 igor kernel: scsi0:0:0:0: Cmd aborted from QINFIFO
Jun 2 14:15:00 igor kernel: aic7xxx_abort returns 0x2002
Jun 2 14:15:00 igor kernel: scsi0:0:0:0: Attempting to queue an ABORT message
Jun 2 14:15:00 igor kernel: scsi0:0:0:0: Command not found
Jun 2 14:15:00 igor kernel: aic7xxx_abort returns 0x2002
Jun 2 14:15:00 igor kernel: scsi0:0:5:0: Attempting to queue an ABORT message
Jun 2 14:15:00 igor kernel: scsi0:0:5:0: Command not found
Jun 2 14:15:00 igor kernel: aic7xxx_abort returns 0x2002
==============================================================

Thanks in advance.

--
David Bell
dibl@xxxxxxxxxxx

--
Psyche-list mailing list
Psyche-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/psyche-list
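
To see which devices the abort attempts in a log like the one above are aimed at, the matching lines can be tallied per device. The following is only a rough sketch: the function name `count_aborts` and the `SAMPLE` list are made up for illustration, and it assumes the four numbers after `scsi` are host, channel, target and LUN, which fits the SCSI IDs 0 and 5 mentioned in the post but is not confirmed by it.

```python
import re
from collections import Counter

# Matches lines such as:
#   Jun 2 14:15:00 igor kernel: scsi0:0:5:0: Attempting to queue an ABORT message
# Assumption: the four fields are host, channel, target, LUN.
ABORT_RE = re.compile(
    r"kernel: scsi(?P<host>\d+):(?P<channel>\w+):(?P<target>\d+):(?P<lun>\d+): "
    r"Attempting to queue an ABORT message"
)


def count_aborts(lines):
    """Count abort attempts per (host, channel, target, lun), all kept as strings."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            counts[match.group("host", "channel", "target", "lun")] += 1
    return counts


# A few lines copied from the log excerpt in the post.
SAMPLE = [
    "Jun 2 14:14:59 igor kernel: scsi0:0:0:0: Attempting to queue an ABORT message",
    "Jun 2 14:14:59 igor kernel: scsi0:0:0:0: Cmd aborted from QINFIFO",
    "Jun 2 14:15:00 igor kernel: aic7xxx_abort returns 0x2002",
    "Jun 2 14:15:00 igor kernel: scsi0:0:0:0: Attempting to queue an ABORT message",
    "Jun 2 14:15:00 igor kernel: scsi0:0:5:0: Attempting to queue an ABORT message",
]

if __name__ == "__main__":
    for (host, channel, target, lun), n in sorted(count_aborts(SAMPLE).items()):
        print(f"scsi{host} channel {channel} target {target} lun {lun}: {n} abort attempt(s)")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `count_aborts(open("/var/log/messages"))`. If the aborts cluster on one target, that may help narrow down whether the disk or a slower device on the shared channel is involved.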