Re: Bizarre RAID "failure"

Nathan Hunsperger <linux-raid@hunsperger.com> · Fri, 20 Feb 2004 00:33:27 -0800

On Thu, Feb 19, 2004 at 02:44:52PM -0800, Tom Maddox wrote:
<SNIP>
> If the system goes down unexpectedly (e.g., because of a power failure),
> the RAID array comes back up dirty and begins to rebuild itself, which
> is odd enough on its own.  What's worse is that, whenever this happens,
> the rebuild hangs at about 2.4%.  When it reaches that point, the array
> becomes totally nonresponsive--I can't even query its status with mdadm
> or any other tool, although I can use "cat /proc/mdstat" to see the
> status of the rebuild.  Any command that attempts to access the RAID
> drive hangs.
<SNIP>
> Has anyone seen this behavior before, and can you recommend a solution?

Tom,

I have had problems very similar to this before.  I was running 14 fibre
channel disks on a QLA2100 HBA w/ various 2.4 kernels.  What I found
was that after a while of heavy IO, all access to the disks stopped,
and the rebuild would hang.  Additionally, any command that required
access to any filesystem data that wasn't cached (on any filesystem)
would hang.  By switching between the 3 or so available QLA drivers,
I could change how long the system ran between reboot and stall.  I knew
the hardware
was fine, as it worked flawlessly under Solaris.  In the end, I had
to upgrade the HBA to a QLA2200, at which time I had no more problems.
Because the hardware worked under a different OS, I have to believe that
my problem was an incompatibility between the QLA2100 and the Linux
drivers (even though they claimed to support it).
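
Since you say "cat /proc/mdstat" still works while everything else hangs,
one way to measure the time between reboot and stall for each driver or
kernel you try is to poll that file and note when the percentage stops
moving.  Below is a rough sketch in Python, not something I ran on the
setup above.  The function names, the 60-second interval and the
5-sample window are my own assumptions, and the regex assumes the usual
"resync = 2.4%" / "recovery = 2.4%" progress line.  The percentage only
has 0.1% resolution, so on a big, slow array you may need a longer
window to avoid false alarms.

```python
import re
import time

# Matches the array header line, e.g. "md0 : active raid5 sdb1[1] sda1[0]"
DEVICE_RE = re.compile(r"^(md\d+)\s*:")
# Matches the progress line, e.g. "[>....]  resync =  2.4% (24/1000) ..."
PROGRESS_RE = re.compile(r"(resync|recovery)\s*=\s*([\d.]+)%")


def parse_progress(mdstat_text):
    """Return {device: percent} for arrays that report a rebuild in progress."""
    progress = {}
    device = None
    for line in mdstat_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            device = header.group(1)
            continue
        found = PROGRESS_RE.search(line)
        if found and device is not None:
            progress[device] = float(found.group(2))
    return progress


def is_stalled(samples, window=5):
    """True if the last `window` samples are all identical (assumed threshold)."""
    recent = samples[-window:]
    return len(recent) == window and len(set(recent)) == 1


if __name__ == "__main__":
    history = {}
    while True:
        # Reading /proc/mdstat does not touch the array's disks.
        with open("/proc/mdstat") as f:
            current = parse_progress(f.read())
        for dev, pct in current.items():
            history.setdefault(dev, []).append(pct)
            if is_stalled(history[dev]):
                print(f"{time.ctime()}: {dev} rebuild stuck at {pct}%")
        time.sleep(60)
```

Run it from a console and let it print to the screen rather than to a
log file, since writes to disk may hang too once the stall hits.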

I hope that at least gives you some possible insight.

- Nathan

> 
> Thanks,
> 
> Tom
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html