Re: mptsas problem

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Sun, 13 Apr 2008 11:58:45 -0500

On Sun, 2008-04-13 at 12:48 -0400, Wakko Warner wrote:
> James Bottomley wrote:
> > On Sun, 2008-04-06 at 21:04 -0400, Wakko Warner wrote:
> > > > From the message you posted, it looks as though there may be a problem 
> > > > with sda.
> > > 
> > > It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
> > > as stated before, it's 64)
> > > 
> > > I performed the same copy again with queue_depth=1 after the array rebuilt.
> > > It worked fine then.  No errors.
> > 
> > Actually, I'd say this is a signal for NCQ errors with the drive.
> 
> Unless it's this specific drive firmware, I'd have to disagree.  I have 6 of
> the exact same drives (can't confirm firmware is the same though) in raid5
> on an aic9410 sas controller w/o problems.  The queue_depth for those is
> 31.  I considered setting the drives I'm having problems with to that value,
> but I really don't want to go through another 4 hour rebuild.

Well, yes, different revs of the firmware can behave differently.  The
libata-core blacklist includes the firmware version as part of the
pattern matching.
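
To illustrate what matching on model plus firmware revision means, here is
a rough sketch in Python.  It is not the kernel code: the table entries,
the flag name and the function name lookup_flag are made up for the
example, and glob matching via fnmatch is only assumed to be a reasonable
stand-in for what libata-core does.

```python
from fnmatch import fnmatchcase

# Hypothetical entries for illustration only; these are NOT copied from
# the kernel's table.  Each entry is (model pattern, firmware pattern,
# flag name).  A firmware pattern of None matches any revision.
BLACKLIST = [
    ("EXAMPLE-DISK-A*", None, "NONCQ"),
    ("EXAMPLE-DISK-B", "1.0*", "NONCQ"),
]


def lookup_flag(model, firmware, table=BLACKLIST):
    """Return the flag of the first entry matching model and firmware.

    Returns None when no entry matches.
    """
    for model_pattern, firmware_pattern, flag in table:
        if not fnmatchcase(model, model_pattern):
            continue
        # An entry without a firmware pattern applies to every revision.
        if firmware_pattern is not None and not fnmatchcase(
            firmware, firmware_pattern
        ):
            continue
        return flag
    return None
```

The point is that two drives with the same model string can land on
different sides of such a table if their firmware revisions differ.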

There's an easy way to verify:  smartctl -i will print the firmware
version string.
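
If you want to compare the firmware across several drives, something like
the following Python sketch could pull the string out of the smartctl -i
output.  It assumes the output carries a "Firmware Version:" line; the
function names and the sample text are invented for the example, and the
sample values are not from any real drive.

```python
import re
import subprocess


def parse_firmware_version(info_text):
    """Return the value of the 'Firmware Version:' line, or None."""
    match = re.search(
        r"^Firmware Version:[ \t]*(.+?)[ \t]*$", info_text, re.MULTILINE
    )
    return match.group(1) if match else None


def read_firmware_version(device):
    """Run 'smartctl -i' on device and parse the firmware string.

    Needs smartmontools installed and usually root privileges.
    """
    result = subprocess.run(
        ["smartctl", "-i", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_firmware_version(result.stdout)


# Made-up sample text in the general shape of smartctl -i output.
SAMPLE = """\
Device Model:     EXAMPLE-DISK-B
Firmware Version: 1.05
"""
```

Calling read_firmware_version("/dev/sda") on each drive and comparing the
results would show whether the two sets of drives really run the same
firmware.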

> > I'm afraid only LSI would be able to say for certain, because the mptsas
> > implements its NCQ handling in firmware.  libata-core doesn't show any
> > special workarounds for your device (ST3750640AS) but that doesn't mean
> > there isn't a problem.  If it's really an NCQ implementation issue, then
> > clamping the queue depth to 1 is about the only fix, I'm afraid.
> 
> If it survives another week, I'd say using depth of 1 worked.
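
Since the sysfs value goes back to its default on the next boot, you may
want to script the clamp.  A minimal Python sketch, assuming the
/sys/block/<disk>/device/queue_depth layout you quoted above; the function
names and the sys_block parameter are made up for the example (the
parameter is only there so the code can be tried against a scratch
directory).

```python
from pathlib import Path


def queue_depth_path(disk, sys_block="/sys/block"):
    """Build the path to the queue_depth attribute of one disk."""
    return Path(sys_block) / disk / "device" / "queue_depth"


def get_queue_depth(disk, sys_block="/sys/block"):
    """Read the current queue depth of a disk as an integer."""
    return int(queue_depth_path(disk, sys_block).read_text().strip())


def set_queue_depth(disk, depth, sys_block="/sys/block"):
    """Write a new queue depth and return the value read back.

    The read-back matters because the driver may not accept the
    requested value as-is.  Writing to the real sysfs needs root.
    """
    queue_depth_path(disk, sys_block).write_text(f"{depth}\n")
    return get_queue_depth(disk, sys_block)


# Example use, run as root on the affected machine:
#     for disk in ("sda", "sdb", "sdc"):
#         print(disk, set_queue_depth(disk, 1))
```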

James
