Re: mpt2sas driver behaving strange with a failed SATA disk behind SAS expander.

Peter Chang <dpf@xxxxxxxxxx> · Wed, 17 Aug 2011 11:49:28 -0700

On 17 August 2011 10:08, Peter Chang <dpf@xxxxxxxxxx> wrote:
> On 17 August 2011 07:25, Fredrik Lindgren <fli@xxxxxxxx> wrote:
>> When doing disk IO on the disks (they are all configured in MD raids)
>> suddenly IO will stop and these messages are printed on the console
>> about once every second:
>>
>> mpt2sas0: log_info(0x31110610): originator(PL), code(0x11), sub_code(0x0610)
>>
>> From what I understand this means:
>>
>> PL_LOGINFO_CODE_RESET (0x00110000)
>> PL_LOGINFO_SUB_CODE_SATA_NON_NCQ_RW_ERR_BIT_SET (0x00000600)
>>
>> So a disk is acting up, generating errors? What does the last "10" mean
>> in the sub_code, is that an identifier for which disk it is?
>
> no, the bottom bits are still part of the error code.
>
> i haven't run w/ your exact fw/driver setup, but i think you'll find
> that you're in a 'loop' where the driver is returning DID_RESET and
> the scsi layer is retrying w/o going through the retry counter logic
> (the command that fails is one that the firmware issued).
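
to make the field split concrete, here is a minimal sketch that pulls a
log_info word apart the same way the console message prints it. the struct
and function names are made up for illustration (they are not the driver's
own), and the bit layout is assumed from the message format: top nibble is
the bus type, next nibble the originator, then an 8-bit code and a 16-bit
sub_code.

```c
#include <stdint.h>

/* Sub-code value quoted earlier in the thread. */
#define SUB_CODE_SATA_NON_NCQ_RW_ERR_BIT_SET 0x0600u

/* Hypothetical container for the decoded fields (illustration only). */
struct loginfo_fields {
    unsigned bus_type;    /* bits 31..28, assumed 0x3 for SAS */
    unsigned originator;  /* bits 27..24, assumed 0x1 for PL  */
    unsigned code;        /* bits 23..16 */
    unsigned sub_code;    /* bits 15..0, all 16 bits belong to the error */
};

/* Split a 32-bit log_info word into its four fields. */
static struct loginfo_fields decode_loginfo(uint32_t log_info)
{
    struct loginfo_fields f;

    f.bus_type   = (log_info >> 28) & 0xFu;
    f.originator = (log_info >> 24) & 0xFu;
    f.code       = (log_info >> 16) & 0xFFu;
    f.sub_code   = log_info & 0xFFFFu;
    return f;
}

/* Nonzero only when the whole 16-bit sub_code equals the given value. */
static int sub_code_is(uint32_t log_info, unsigned expected)
{
    return decode_loginfo(log_info).sub_code == expected;
}
```

feeding it 0x31110610 gives code 0x11 and sub_code 0x0610, and
sub_code_is(0x31110610, SUB_CODE_SATA_NON_NCQ_RW_ERR_BIT_SET) is false: the
low "10" makes it a different sub-code from 0x0600, not a disk index.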

since someone else gave the error code (i didn't check if i just had
some other magic header)...

the problem is probably a combination of the disk and controller
firmware. when an NCQ request fails the firmware will do a READ LOG
EXT (log page 10h) to figure out why. some disks don't handle this
sequence the way the firmware expects, so it starts the COMRESET dance
w/ the disk and returns an event w/ the loginfo to the driver/kernel.

the 'fix' (really a workaround) is in
mpt2sas_scsih.c:_scsih_io_done(). in the case for
MPI2_IOCSTATUS_SCSI_TASK_TERMINATED change the DID_RESET to
DID_SOFT_ERROR and the rest of the scsi layer will go down the regular
retry handling and you'll get out of the 'loop'.
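
as a rough picture of that change, here is a standalone userspace model of
the mapping, not the driver source. the function name and the
apply_workaround flag are hypothetical; the constant values are what i
recall from the kernel's scsi and mpi2 headers, so check them against your
tree before relying on them.

```c
#include <stdint.h>

/* SCSI host byte codes, values as recalled from include/scsi/scsi.h. */
#define DID_OK          0x00
#define DID_RESET       0x08
#define DID_SOFT_ERROR  0x0b

/* IOC status values, as recalled from the MPI2 headers. */
#define MPI2_IOCSTATUS_MASK                  0x7FFF
#define MPI2_IOCSTATUS_SCSI_TASK_TERMINATED  0x0048

/*
 * Hypothetical model of the one case that matters here: build the
 * scmd->result style value (host byte in bits 23..16) for a reply
 * whose IOC status is TASK_TERMINATED.
 */
static uint32_t model_io_done_result(uint16_t ioc_status, int apply_workaround)
{
    switch (ioc_status & MPI2_IOCSTATUS_MASK) {
    case MPI2_IOCSTATUS_SCSI_TASK_TERMINATED:
        /* stock behaviour: DID_RESET; workaround: DID_SOFT_ERROR */
        if (apply_workaround)
            return (uint32_t)DID_SOFT_ERROR << 16;
        return (uint32_t)DID_RESET << 16;
    default:
        /* every other status is left out of this sketch */
        return (uint32_t)DID_OK << 16;
    }
}
```

with apply_workaround set, the terminated command comes back with a host
byte of DID_SOFT_ERROR instead of DID_RESET, which is the one-line change
described above.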

lsi is supposed to have this fix coming soon.

disabling NCQ will 'fix' this as well.

\p