This is a bug I've been playing with for a while now, but I think I've narrowed it down about as far as I can without additional help. I believe we are hitting this bit of code, as I have seen the message before in previous panics.

    toss_command:
            printk(KERN_EMERG "qlogicpti%d: request queue overflow\n",
                   qpti->qpti_id);

            /* Unfortunately, unless you use the new EH code, which
             * we don't, the midlayer will ignore the return value,
             * which is insane.  We pick up the pieces like this.
             */
            Cmnd->result = DID_BUS_BUSY;
            done(Cmnd);
            return 1;

Correct me if I'm wrong, but I don't see how we're picking up any pieces here. Something went wrong, SCSI requests built up to an unmanageable point, and we just say the bus is busy? Granted, I'm not really sure how you would pick up any pieces in that case, unless you set the bus busy before the queue overflows and just try to wait it out.

Take a look at the iostat information below. iostat was configured to refresh every second; the bottom was cut off during the panic.

iostat log > http://pastebin.com/ea96AucT

From this you can see that sdc was the first drive to stop responding. Its r/s and w/s drop to zero but %util stays at 100. Shortly after, the request queue overflows and sets the whole bus to busy, which can be seen in the last portion of the log (which didn't finish as the system panicked). All of the other disks on that bus then follow suit with sdc because the bus is essentially screeching to a halt.

Below you will find the kernel panic.

kernel panic > http://pastebin.com/n9agfz1z

Again, correct me if I'm wrong, but it would seem that any pointers into the request queue are now invalid because the queue has overflowed.

Below is just some conjecture on my part. Should the correct behaviour here not be to fail the disk that is holding up the rest of the bus?
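
To make the conjecture a bit more concrete, here is a rough user-space sketch in C of the kind of accounting I have in mind. It is not code from qlogicpti.c, and every name and number in it (bus_queue, QUEUE_SLOTS, PER_TARGET_MAX and so on) is made up for illustration. The idea is simply to cap how many queue slots one disk may hold and to report "busy" before the shared queue can overflow, so a single stalled disk cannot starve the rest of the bus.

```c
/* User-space sketch only; none of these names exist in qlogicpti.c. */
#include <assert.h>
#include <string.h>

#define QUEUE_SLOTS    8  /* hypothetical size of the shared request queue */
#define MAX_TARGETS    4  /* hypothetical number of disks on the bus */
#define PER_TARGET_MAX 3  /* hypothetical cap on outstanding requests per disk */

enum submit_status {
    SUBMIT_OK,          /* request accepted into the queue */
    SUBMIT_TARGET_BUSY, /* this disk already holds its share of slots */
    SUBMIT_HOST_BUSY    /* the whole queue is full; caller should retry later */
};

struct bus_queue {
    unsigned int in_flight;               /* slots in use across all disks */
    unsigned int per_target[MAX_TARGETS]; /* slots in use by each disk */
};

/* Start with an empty queue and no outstanding requests on any disk. */
static void bus_queue_init(struct bus_queue *q)
{
    memset(q, 0, sizeof(*q));
}

/*
 * Try to reserve one slot for a request to the given disk. The request is
 * refused (instead of overflowing the queue) when the disk is at its cap
 * or when every slot is already taken.
 */
static enum submit_status bus_queue_submit(struct bus_queue *q,
                                           unsigned int target)
{
    assert(target < MAX_TARGETS);

    if (q->per_target[target] >= PER_TARGET_MAX)
        return SUBMIT_TARGET_BUSY;
    if (q->in_flight >= QUEUE_SLOTS)
        return SUBMIT_HOST_BUSY;

    q->per_target[target]++;
    q->in_flight++;
    return SUBMIT_OK;
}

/* Release the slot held by a request that the disk has completed. */
static void bus_queue_complete(struct bus_queue *q, unsigned int target)
{
    assert(target < MAX_TARGETS);
    assert(q->per_target[target] > 0);

    q->per_target[target]--;
    q->in_flight--;
}
```

With something like this, a disk that stops completing requests ends up pinned at PER_TARGET_MAX and keeps getting SUBMIT_TARGET_BUSY, which would also be a reasonable signal for deciding that the disk should be failed. Whether the hardware or the driver can actually attribute queue entries to a target this way is exactly the part I am unsure about.
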
From what I see, it is quite likely sdc is bad, so I will be replacing it. However, having the whole system panic because of one bad disk seems counterintuitive. I realise this is quite an old driver, and it may have been written before we had ways of dealing with these types of issues. Or perhaps it's a hardware limitation that prevents us from pinpointing what is actually no longer responding on the bus?