My machine consistently locks up after it has been powered up for around 20 minutes. Each time it locks up, the console dumps the messages below. I have read some postings on this list saying that this is a hardware problem. However, the problem only occurs when booting the SMP kernel, so to me it looks more like a badly written driver than faulty hardware. The kernel is 2.4.20-18smp, running on a Tyan motherboard with dual Athlon CPUs and an onboard aic7899 controller, using md RAID-1 mirroring on Seagate Cheetah 36GB SCSI disks. The system only survives with non-SMP kernels. Any help is appreciated. Thanks.
regards, David Chow
Jul 21 16:14:36 webserver kernel: scsi0:0:0:0: Attempting to queue an ABORT message
Jul 21 16:14:36 webserver kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x8
Jul 21 16:14:36 webserver kernel: ACCUM = 0x0, SINDEX = 0x1f, DINDEX = 0xe4, ARG_2 = 0x0
Jul 21 16:14:36 webserver kernel: HCNT = 0x0 SCBPTR = 0xe
Jul 21 16:14:36 webserver kernel: SCSISEQ = 0x12, SBLKCTL = 0xa
Jul 21 16:14:36 webserver kernel: DFCNTRL = 0x0, DFSTATUS = 0x89
Jul 21 16:14:36 webserver kernel: LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
Jul 21 16:14:36 webserver kernel: SSTAT0 = 0x0, SSTAT1 = 0x8
Jul 21 16:14:36 webserver kernel: SCSIPHASE = 0x0
Jul 21 16:14:36 webserver kernel: STACK == 0x3, 0x108, 0x160, 0x0
Jul 21 16:14:36 webserver kernel: SCB count = 254
Jul 21 16:14:36 webserver kernel: Kernel NEXTQSCB = 63
Jul 21 16:20:32 webserver kernel: Card NEXTQSCB = 63
Jul 21 16:20:33 webserver kernel: 144 145 146 147 140 141 142 143 136 137 138 139 132 133 134 135 128 19 12 13 14 15 8 9 10 11 4 5 6 1 3 2 7 0
Jul 21 16:20:33 webserver kernel: DevQ(0:0:0): 0 waiting
Jul 21 16:20:33 webserver kernel: DevQ(0:2:0): 0 waiting
Jul 21 16:20:33 webserver kernel: scsi0:0:0:0: Cmd aborted from QINFIFO
Jul 21 16:20:33 webserver kernel: aic7xxx_abort returns 0x2002
Jul 21 16:20:33 webserver kernel: scsi0:0:0:0: Attempting to queue an ABORT message
Jul 21 16:20:33 webserver kernel: scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x168
Jul 21 16:20:33 webserver kernel: ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xe4, ARG_2 = 0x1c
Jul 21 16:20:33 webserver kernel: HCNT = 0x0 SCBPTR = 0x6
Jul 21 16:20:33 webserver kernel: SCSISEQ = 0x12, SBLKCTL = 0xa
Jul 21 16:20:33 webserver kernel: DFCNTRL = 0x0, DFSTATUS = 0x89
Jul 21 16:20:33 webserver kernel: LASTPHASE = 0xa0, SCSISIGI = 0xa4, SXFRCTL0 = 0x88
Jul 21 16:20:33 webserver kernel: SSTAT0 = 0x0, SSTAT1 = 0x0
Jul 21 16:20:33 webserver kernel: SCSIPHASE = 0x0
Jul 21 16:20:33 webserver kernel: STACK == 0x175, 0xe7, 0xe7, 0xe7
Jul 21 16:20:33 webserver kernel: SCB count = 254
Jul 21 16:20:33 webserver kernel: Kernel NEXTQSCB = 56
Jul 21 16:20:33 webserver kernel: Card NEXTQSCB = 23
Jul 21 16:20:33 webserver kernel: QINFIFO entries: 23
Jul 21 16:20:33 webserver kernel: Waiting Queue entries:
Jul 21 16:20:33 webserver kernel: Disconnected Queue entries: 24:44 9:16
Jul 21 16:20:33 webserver kernel: QOUTFIFO entries:
Jul 21 16:20:33 webserver kernel: Sequencer Free SCB List: 13 28 20 10 22 26 18 30 7 11 23 2 8 17 4 3 16 29 25 1 14 5 12 27 21 15 19 0 31
Jul 21 16:20:33 webserver kernel: Sequencer SCB Info: 0(c 0x60, s 0x27, l 0, t 0xff) 1(c 0x64, s 0x7, l 0, t 0xff) 2(c 0x64, s 0x7, l 0, t 0xff) 3(c 0x64, s 0x7, l 0, t 0xff) 4(c 0x64, s 0x7, l 0, t 0xff) 5(c 0x60, s 0x27, l 0, t 0xff) 6(c 0x64, s 0x7, l 0, t 0x73) 7(c 0x64, s 0x7, l 0, t 0xff) 8(c 0x64, s 0x7, l 0, t 0xff) 9(c 0x64, s 0x7, l 0, t 0x10) 10(c 0x0, s 0x7, l 0, t 0xff) 11(c 0x64, s 0x7, l 0, t 0xff) 12(c 0x60, s 0x27, l 0, t 0xff) 13(c 0x0, s 0x7, l 0, t 0xff) 14(c 0x64, s 0x7, l 0, t 0xff) 15(c 0x60, s 0x27, l 0, t 0xff) 16(c 0x64, s 0x7, l 0, t 0xff) 17(c 0x64, s 0x7, l 0, t 0xff) 18(c 0x0, s 0x7, l 0, t 0xff) 19(c 0x60, s 0x27, l 0, t 0xff) 20(c 0x0, s 0x7, l 0, t 0xff) 21(c 0x60, s 0x27, l 0, t 0xff) 22(c 0x0, s 0x7, l 0, t 0xff) 23(c 0x64, s 0x7, l 0, t 0xff) 24(c 0x64, s 0x7, l 0, t 0x2c) 25(c 0x64, s 0x7, l 0, t 0xff) 26(c 0x0, s 0x7, l 0, t 0xff) 27(c 0x60, s 0x27, l 0, t 0xff) 28(c 0x0, s 0x7, l 0, t 0xff) 29(c 0x64, s 0x7, l 0, t 0xff) 30(c 0x64, s 0x7, l 0, t 0xff) 31(c 0x60, s 0x27, l 0,
Jul 21 16:20:33 webserver kernel: 0xff)
Jul 21 16:20:33 webserver kernel: Pending list: 51(c 0x64, s 0x7, l 0), 44(c 0x60, s 0x7, l 0), 78(c 0x64, s 0x7, l 0), 16(c 0x60, s 0x7, l 0), 130(c 0x64, s 0x7, l 0), 23(c 0x74, s 0x7, l 0), 115(c 0x64, s 0x7, l 0)
Jul 21 16:20:33 webserver kernel: Kernel Free SCB list: 18 106 58 36 121 41 126 39 26 92 66 89 95 191 60 116 34 85 113 83 27 24 109 67 100 80 49 63 31 20 52 94 77 105 253 108 48 117 57 59 17 110 55 71 123 91 30 81 82 88 122 75 28 93 107 90 96 251 38 40 54 114 61 111 42 70 125 102 37 25 74 68 21 73 79 131 104 43 101 33 72 249 127 46 84 87 50 62 120 29 65 35 97 119 22 47 64 76 103 45 112 86 99 195 32 98 124 250 248 118 69 53 190 189 188 252 244 245 246 247 240 241 242 243 236 237 238 239 232 233 234 235 228 229 230 231 224 225 226 227 220 221 222 223 216 217 218 219 212 213 214 215 208 209 210 211 204 205 206 207 200 201 202 203 196 197 198 199 192 193 194 129 184 185 186 187 180 181 182 183 176 177 178 179 172 173 174 175 168 169 170 171 164 165 166 167 160 161 162 163 156 157 158 159 152 153 154 155 148 149 150 151 144 145 146 147 140 141 142 143 136 137 138 139 132 133 134 135 128 19 12 13 14 15 8 9 10 11 4 5 6 1 3 2 7 0
Jul 21 16:20:33 webserver kernel: DevQ(0:0:0): 0 waiting
Jul 21 16:20:33 webserver kernel: DevQ(0:2:0): 0 waiting
Jul 21 16:20:33 webserver kernel: Recovery SCB completes
Jul 21 16:20:33 webserver kernel: (scsi0:A:0:0): Queuing a recovery SCB
Jul 21 16:20:33 webserver kernel: scsi0:0:0:0: Device is disconnected, re-queuing SCB
Jul 21 16:20:33 webserver kernel: Recovery code sleeping
Jul 21 16:20:33 webserver kernel: Recovery code awake
Jul 21 16:20:33 webserver kernel: aic7xxx_abort returns 0x2002
Jul 21 16:20:33 webserver kernel: scsi0:0:0:0: Attempting to queue an ABORT message
Jul 21 16:20:33 webserver kernel: scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x168
Jul 21 16:20:33 webserver kernel: ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xe4, ARG_2 = 0x1c
Jul 21 16:20:33 webserver kernel: HCNT = 0x0 SCBPTR = 0x6
Jul 21 16:20:33 webserver kernel: SCSISEQ = 0x12, SBLKCTL = 0xa
Jul 21 16:20:33 webserver kernel: DFCNTRL = 0x0, DFSTATUS = 0x89
Jul 21 16:20:33 webserver kernel: LASTPHASE = 0xa0, SCSISIGI = 0xa4, SXFRCTL0 = 0x88
Jul 21 16:20:33 webserver kernel: SSTAT0 = 0x0, SSTAT1 = 0x0
Jul 21 16:20:33 webserver kernel: SCSIPHASE = 0x0
Jul 21 16:20:33 webserver kernel: STACK == 0x175, 0xe7, 0xe7, 0xe7
Jul 21 16:20:33 webserver kernel: SCB count = 254
Jul 21 16:20:33 webserver kernel: Kernel NEXTQSCB = 23
Jul 21 16:20:33 webserver kernel: Card NEXTQSCB = 78
Jul 21 16:20:33 webserver kernel: QINFIFO entries: 78 56
Jul 21 16:20:33 webserver kernel: Waiting Queue entries:
Jul 21 16:20:33 webserver kernel: Disconnected Queue entries: 24:44 9:16
Jul 21 16:20:33 webserver kernel: QOUTFIFO entries:
Jul 21 16:20:33 webserver kernel: Sequencer Free SCB List: 13 28 20 10 22 26 18 30 7 11 23 2 8 17 4 3 16 29 25 1 14 5 12 27 21 15 19 0 31
Jul 21 16:20:33 webserver kernel: Sequencer SCB Info: 0(c 0x60, s 0x27, l 0, t 0xff) 1(c 0x64, s 0x7, l 0, t 0xff) 2(c 0x64, s 0x7, l 0, t 0xff) 3(c 0x64, s 0x7, l 0, t 0xff) 4(c 0x64, s 0x7, l 0, t 0xff) 5(c 0x60, s 0x27, l 0, t 0xff) 6(c 0x64, s 0x7, l 0, t 0x73) 7(c 0x64, s 0x7, l 0, t 0xff) 8(c 0x64, s 0x7, l 0, t 0xff) 9(c 0x64, s 0x7, l 0, t 0x10) 10(c 0x0, s 0x7, l 0, t 0xff) 11(c 0x64, s 0x7, l 0, t 0xff) 12(c 0x60, s 0x27, l 0, t 0xff) 13(c 0x0, s 0x7, l 0, t 0xff) 14(c 0x64, s 0x7, l 0, t 0xff) 15(c 0x60, s 0x27, l 0, t 0xff) 16(c 0x64, s 0x7, l 0, t 0xff) 17(c 0x64, s 0x7, l 0, t 0xff) 18(c 0x0, s 0x7, l 0, t 0xff) 19(c 0x60, s 0x27, l 0, t 0xff) 20(c 0x0, s 0x7, l 0, t 0xff) 21(c 0x60, s 0x27, l 0, t 0xff) 22(c 0x0, s 0x7, l 0, t 0xff) 23(c 0x64, s 0x7, l 0, t 0xff) 24(c 0x64, s 0x7, l 0, t 0x2c) 25(c 0x64, s 0x7, l 0, t 0xff) 26(c 0x0, s 0x7, l 0, t 0xff) 27(c 0x60, s 0x27, l 0, t 0xff) 28(c 0x0, s 0x7, l 0, t 0xff) 29(c 0x64, s 0x7, l 0, t 0xff) 30(c 0x64, s 0x7, l 0, t 0xff) 31(c 0x60, s 0x27, l 0,
Jul 21 16:20:33 webserver kernel: 0xff)
Jul 21 16:20:33 webserver kernel: Pending list: 56(c 0x60, s 0x7, l 0), 51(c 0x64, s 0x7, l 0), 44(c 0x60, s 0x7, l 0), 78(c 0x74, s 0x7, l 0), 16(c 0x60, s 0x7, l 0), 130(c 0x64, s 0x7, l 0), 115(c 0x64, s 0x7, l 0)
Jul 21 16:20:33 webserver kernel: Kernel Free SCB list: 18 106 58 36 121 41 126 39 26 92 66 89 95 191 60 116 34 85 113 83 27 24 109 67 100 80 49 63 31 20 52 94 77 105 253 108 48 117 57 59 17 110 55 71 123 91 30 81 82 88 122 75 28 93 107 90 96 251 38 40 54 114 61 111 42 70 125 102 37 25 74 68 21 73 79 131 104 43 101 33 72 249 127 46 84 87 50 62 120 29 65 35 97 119 22 47 64 76 103 45 112 86 99 195 32 98 124 250 248 118 69 53 190 189 188 252 244 245 246 247 240 241 242 243 236 237 238 239 232 233 234 235 228 229 230 231 224 225 226 227 220 221 222 223 216 217 218 219 212 213 214 215 208 209 210 211 204 205 206 207 200 201 202 203 196 197 198 199 192 193 194 129 184 185 186 187 180 181 182 183 176 177 178 179 172 173 174 175 168 169 170 171 164 165 166 167 160 161 162 163 156 157 158 159 152 153 154 155 148 149 150 151 144 145 146 147 140 141 142 143 136 137 138 139 132 133 134 135 128 19 12 13 14 15 8 9 10 11 4 5 6 1 3 2 7 0
Jul 21 16:20:33 webserver kernel: DevQ(0:0:0): 0 waiting
Jul 21 16:20:33 webserver kernel: DevQ(0:2:0): 0 waiting
Jul 21 16:20:33 webserver kernel: scsi0:0:0:0: Cmd aborted from QINFIFO
Jul 21 16:20:33 webserver kernel: aic7xxx_abort returns 0x2002
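For anyone comparing dumps like the one above, here is a minimal Python sketch that pulls the abort attempts and the Kernel/Card NEXTQSCB pairs out of syslog lines, so that differences between dumps are easier to spot. It is not part of the aic7xxx driver or any existing tool. The function name `summarize` and the regular expressions are assumptions based only on the line format shown in this log.

```python
import re

# Patterns are assumptions derived from the syslog lines in this post.
ABORT_RE = re.compile(
    r"^(\w{3} +\d+ [\d:]+) \S+ kernel: (scsi\d+:\d+:\d+:\d+): "
    r"Attempting to queue an ABORT message"
)
NEXTQ_RE = re.compile(r"kernel: (Kernel|Card) NEXTQSCB = (\d+)")


def summarize(lines):
    """Return (aborts, pairs) extracted from syslog-style lines.

    aborts: list of (timestamp, device) for each abort attempt.
    pairs:  list of (kernel_nextqscb, card_nextqscb), one per dump.
    """
    aborts = []
    pairs = []
    current = {}
    for line in lines:
        m = ABORT_RE.match(line)
        if m:
            aborts.append((m.group(1), m.group(2)))
            continue
        m = NEXTQ_RE.search(line)
        if m:
            current[m.group(1)] = int(m.group(2))
            # A dump reports both values; emit the pair once both are seen.
            if "Kernel" in current and "Card" in current:
                pairs.append((current["Kernel"], current["Card"]))
                current = {}
    return aborts, pairs


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Jul 21 16:14:36 webserver kernel: scsi0:0:0:0: Attempting to queue an ABORT message",
    "Jul 21 16:14:36 webserver kernel: Kernel NEXTQSCB = 63",
    "Jul 21 16:20:32 webserver kernel: Card NEXTQSCB = 63",
    "Jul 21 16:20:33 webserver kernel: scsi0:0:0:0: Attempting to queue an ABORT message",
    "Jul 21 16:20:33 webserver kernel: Kernel NEXTQSCB = 56",
    "Jul 21 16:20:33 webserver kernel: Card NEXTQSCB = 23",
]

if __name__ == "__main__":
    aborts, pairs = summarize(SAMPLE)
    for stamp, device in aborts:
        print(f"abort attempt at {stamp} on {device}")
    for kernel_q, card_q in pairs:
        print(f"Kernel NEXTQSCB={kernel_q} Card NEXTQSCB={card_q}")
```

To run it on a full log, pass the open file instead of `SAMPLE`, for example `summarize(open("/var/log/messages"))`; the path is only an example and depends on the syslog configuration.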
-- Psyche-list mailing list Psyche-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/psyche-list