On 08/17/11 07:25, Fredrik Lindgren wrote:
Hello,
I'm seeing something strange on a Supermicro 847E16-R1400. It has SAS
expanders with SATA disks behind them (Seagate Barracuda XT). The SAS
card is an LSI SAS9211-8i.
When doing disk I/O on the disks (they are all configured in MD RAIDs),
I/O suddenly stops and these messages are printed on the console about
once every second:
mpt2sas0: log_info(0x31110610): originator(PL), code(0x11), sub_code(0x0610)
From what I understand this means:
PL_LOGINFO_CODE_RESET (0x00110000)
PL_LOGINFO_SUB_CODE_SATA_NON_NCQ_RW_ERR_BIT_SET (0x00000600)
So a disk is acting up, generating errors? What does the last "10" mean
in the sub_code? Is that an identifier for which disk it is?
After some time, the message changed:
mpt2sas0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
Now the disk seems to have died completely?
PL_LOGINFO_CODE_RESET (0x00110000)
PL_LOGINFO_SUB_CODE_DSCVRY_SATA_INIT_TIMEOUT (0x00001000)
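
For reference, the field split of both log_info values can be reproduced
with a short sketch. It assumes the layout the mpt2sas driver uses when it
prints these messages (bits 31-28 bus type, 27-24 originator, 23-16 code,
15-0 sub_code). The lookup tables hold only the constants quoted in this
thread, and treating the low byte of the sub_code as a separate detail
field is an assumption made here for illustration, not something the
header is known to guarantee.

```python
# Sketch: split an mpt2sas log_info word into its fields.
# Assumed layout: [31:28] bus type, [27:24] originator,
#                 [23:16] code, [15:0] sub_code.
ORIGINATORS = {0x0: "IOP", 0x1: "PL", 0x2: "IR"}

# Only the constants quoted in this thread; not a complete table.
PL_CODES = {0x11: "PL_LOGINFO_CODE_RESET"}
PL_SUB_CODES = {
    0x0600: "PL_LOGINFO_SUB_CODE_SATA_NON_NCQ_RW_ERR_BIT_SET",
    0x1000: "PL_LOGINFO_SUB_CODE_DSCVRY_SATA_INIT_TIMEOUT",
}


def decode_log_info(value):
    """Return the fields of a 32-bit log_info value as a dict."""
    sub_code = value & 0xFFFF
    code = (value >> 16) & 0xFF
    originator = (value >> 24) & 0xF
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get(originator, "unknown"),
        "code": code,
        "code_name": PL_CODES.get(code, "unknown"),
        "sub_code": sub_code,
        # Assumption: match on the high byte, report the low byte separately.
        "sub_code_name": PL_SUB_CODES.get(sub_code & 0xFF00, "unknown"),
        "sub_code_low_byte": sub_code & 0x00FF,
    }


if __name__ == "__main__":
    for raw in (0x31110610, 0x31111000):
        print(hex(raw), decode_log_info(raw))
```

The sketch only separates the trailing 0x10 from the 0x0600 base value; it
does not say what that byte encodes.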
I think sub_code 0x0610 indicates an error in the SATA ReadLogExt
command, after which the disk drive failed to initialize (SATA
initialization timeout, sub_code 0x1000). Since the disks are connected
through an expander, the link between disk and expander should be
actively transmitting FIS frames. You can verify whether the disk link
is up by checking the expander routing tables.
Try reducing the link speed (from 6 Gb/s to 3 Gb/s) on the
HBA-expander-disk path and disabling Native Command Queuing (NCQ), and
see whether either helps.
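
One way to try the queuing suggestion on Linux is to lower the SCSI
device queue depth to 1 through sysfs, so that only one command is
outstanding per disk. This is a minimal sketch: the helper name
set_queue_depth and its sysfs_root parameter are made up for
illustration, and it is an assumption that a depth of 1 makes the HBA
firmware stop issuing NCQ commands to a SATA disk behind an expander.
It needs root and does not persist across reboots.

```python
from pathlib import Path


def set_queue_depth(disk, depth=1, sysfs_root="/sys"):
    """Write `depth` to <sysfs_root>/block/<disk>/device/queue_depth.

    Returns (previous_value, new_value) as strings read from the file.
    `sysfs_root` is a hypothetical parameter so the helper can be tried
    against a scratch directory instead of the real /sys.
    """
    path = Path(sysfs_root) / "block" / disk / "device" / "queue_depth"
    previous = path.read_text().strip()
    path.write_text(f"{depth}\n")
    return previous, path.read_text().strip()
```

For example, set_queue_depth("sdb") would limit /dev/sdb to a single
outstanding command; writing the previous value back restores the old
setting.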