Re: sd 6:0:0:0: [sdb] Unaligned partial completion

On 06/11/2018 02:40 PM, James Bottomley wrote:
> On Mon, 2018-06-11 at 12:20 -0400, Douglas Gilbert wrote:
>> I have also seen Aborted Command sense when doing heavy testing on
>> one or more SAS disks behind a SAS expander. I put it down to a
>> temporary lack of paths available (on the link between the host's HBA
>> and the expander) when one of those SAS disks tries to get a
>> connection back to the host with the data (data-in transfer) from an
>> earlier READ command.
>>
>> In my code (ddpt and sg_dd) I treat it as a "retry" type error and in
>> my experience that works. IOW a follow-up READ with the same
>> parameters is successful.
>
> We do treat ABORTED_COMMAND as a retry.  However, it will tick down the
> retry count (usually 3) and then fail if it still occurs.  How long
> does this condition persist for? Because if it's long lived we could
> treat it as ADD_TO_MLQUEUE which would mean we'd retry until the
> timeout condition was reached.

On my system it's a bit hard to tell, because as soon as ZFS sees the read error it starts resilvering to repair the sector that reported the I/O error. Without the scrub, it happened once over a 5-day window. During the scrub, occurrences that failed all the retries were usually tens of minutes apart, though on some occasions they were only about 5-10 minutes apart. It definitely seems to be load-related, so how long and how hard the load stays elevated is a factor.
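
To make the difference between the two policies concrete, here is a rough userspace sketch in Python. It is not kernel code and not what ddpt or sg_dd actually do. FakeDisk, read_count_bounded and read_deadline_bounded are hypothetical names invented for this illustration. The only real constant is the ABORTED COMMAND sense key value (0x0B). I am also assuming that "retry count 3" means three retries after the first attempt, and the 30-second timeout and 0.1-second delay are arbitrary.

```python
import time

ABORTED_COMMAND = 0x0B  # SCSI sense key: ABORTED COMMAND
NO_SENSE = 0x00         # SCSI sense key: NO SENSE (command succeeded)


class FakeDisk:
    """Hypothetical stand-in for a disk that aborts READs while the link is busy."""

    def __init__(self, busy_attempts):
        self.busy_attempts = busy_attempts  # how many READs fail before one succeeds
        self.attempts = 0

    def read(self, lba, blocks):
        """Pretend to issue a READ and return the resulting sense key."""
        self.attempts += 1
        if self.attempts <= self.busy_attempts:
            return ABORTED_COMMAND
        return NO_SENSE


def read_count_bounded(disk, lba, blocks, retries=3):
    """Retry on Aborted Command until a fixed retry count is used up."""
    sense = disk.read(lba, blocks)
    for _ in range(retries):
        if sense != ABORTED_COMMAND:
            break
        sense = disk.read(lba, blocks)
    return sense


def read_deadline_bounded(disk, lba, blocks, timeout_s=30.0, delay_s=0.1,
                          now=time.monotonic, sleep=time.sleep):
    """Retry on Aborted Command until the command's overall timeout expires."""
    deadline = now() + timeout_s
    while True:
        sense = disk.read(lba, blocks)
        if sense != ABORTED_COMMAND or now() >= deadline:
            return sense
        sleep(delay_s)  # brief pause so the congested link can drain
```

With a disk that stays busy for ten consecutive attempts, the count-bounded version gives up after four READs and reports the error upward (which is what triggers the ZFS repair here), while the deadline-bounded version keeps going and succeeds on the eleventh attempt, provided that happens within the timeout.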

--Ted



