Re: aic79xx driver - hotswap error

On Mon, 2006-08-28 at 14:15 +0200, Martin Zuziak wrote:
> Hello all
> 
> Hot-swapping doesn't seem to work with the aic79xx driver in kernel
> 2.6.17.9. Removing or adding a disk from/to a running system makes i/o
> to any disk on the bus fail.
> 
> The machine is an IBM x346 server with an x86_64 CPU and an aic7902 SCSI
> controller.
> 
> A copy of the system log is here:
> http://www.math.ku.dk/~zuziak/tmp/aic79xx_error_2.6.17.9.log
> 
> It shows the result of removing the third disk: the first disk (the only
> one mounted) becomes inaccessible.
> 
> Kernel 2.6.15.7 seems to work but I have had no luck with newer kernels.
> 
> Has anyone seen hot-swapping work with the aic79xx driver in recent
> kernels?

Are you sure your system is hot swap safe?  The whole log mess begins
with "someone reset channel A", which means the card detected a bus reset
that it didn't initiate.  That's going to be either because your system
isn't meant to be hot plugged and the swap triggered a spike on the
reset pin, or because your hot swap drive setup intentionally resets the
bus on unplug.  Knowing which would help.

So, the driver managed to get into the ahd_pause_and_flushwork()
function, probably while trying to queue the abort SCB, and while there
it detected an infinite loop and printed out the "Infinite interrupt
loop, INTSTAT = 8" message.  The INTSTAT value of 0x08 maps to SCSIINT,
so next you would look at the SCSIINT1 and SCSIINT2 registers to see
just *what* is causing the loop.  There you see SSTAT1[0x20]:(SCSIRSTI).
This tells us the driver is *still* getting a SCSI Reset In interrupt
from the card, even over 1 minute after you pulled the drive.  So, your
SCSI bus hung because everything on the bus is being subjected to an
infinite bus reset condition.  The likely cause is either A) your bus
isn't hot swap safe and you hot swapped anyway, and in the process you
disconnected the termination power source or the termination itself, or
just plain flaked out other devices on the bus, or B) something in your
hot swap enclosure is broken and throws an infinite bus reset when the
drive is removed.  Either way, this is not behavior I would attribute to
the aic79xx driver; I suspect that it is innocent here and that the
hardware is to blame.
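
For anyone reading a similar register dump, the decoding steps above can
be sketched roughly as follows.  This is only an illustration, not code
from the driver: the helper names are made up, and the two bit values
(SCSIINT = 0x08 in INTSTAT, SCSIRSTI = 0x20 in SSTAT1) are the only ones
used because they are the only ones stated in this message.  Other bits
are reported as unknown rather than guessed.

```python
# Hypothetical helper for reading a register dump by hand.
# Only the two bits named in the message are listed; this is an
# assumption-limited table, not the driver's full register map.
INTSTAT_BITS = {0x08: "SCSIINT"}
SSTAT1_BITS = {0x20: "SCSIRSTI"}


def decode(value, names):
    """Return names of the bits set in an 8-bit value, lowest bit first.

    Bits that are set but missing from `names` are reported as
    unknown(0x..) instead of being guessed.
    """
    found = []
    for bit in (1 << i for i in range(8)):
        if value & bit:
            found.append(names.get(bit, f"unknown(0x{bit:02x})"))
    return found


def reset_still_asserted(intstat, sstat1):
    """True when a SCSI interrupt is pending and SCSI Reset In is set."""
    return bool(intstat & 0x08) and bool(sstat1 & 0x20)


if __name__ == "__main__":
    # Values taken from the log discussed above.
    print(decode(0x08, INTSTAT_BITS))
    print(decode(0x20, SSTAT1_BITS))
    print(reset_still_asserted(0x08, 0x20))
```

If the last check stays true across repeated dumps long after the drive
was pulled, that matches the stuck bus reset pattern described here.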

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband


