Re: Adaptec ASR-51245 and aacraid driver timeouts

On 10/24/20 2:33 PM, David C. Partridge wrote:
Hi again,

I know there's been a lot of activity lately, so this could well have been
missed.

I'd dearly like to be able to power the drives down when the array is idle,
but this problem seems to make that impossible.

Are any of the folks that know the Adaptec raid cards and the aacraid driver
here?

Thanks
David

-----Original Message-----
From: David C. Partridge [mailto:david.partridge@xxxxxxxxxxxxx]
Sent: 21 October 2020 13:02
To: linux-scsi@xxxxxxxxxxxxxxx
Subject: Adaptec ASR-51245 and aacraid driver timeouts

I'm running Lubuntu x64 20.04.1 kernel 5.4.0-52-generic with an Adaptec
ASR-51245 hosting a RAID-5 array.

If I configure the card to power down the drives in the raid array after a
period of idleness, the next time my server attempts to access the logical
device I get:

Oct 19 04:03:03 charon kernel: aacraid: Host adapter abort request.
                                aacraid: Outstanding commands on (0,0,0,0):
Oct 19 04:03:03 charon kernel: aacraid: Host adapter reset request. SCSI hang ?
Oct 19 04:03:18 charon kernel: aacraid: Host adapter reset request. SCSI hang ?
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: midlevel-0
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: lowlevel-0
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: error handler-0
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: firmware-1
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: kernel-0
Oct 19 04:03:48 charon kernel: sd 0:0:0:0: Device offlined - not ready after error recovery
Oct 19 04:03:48 charon kernel: sd 0:0:0:0: [sda] tag#215 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Oct 19 04:03:48 charon kernel: sd 0:0:0:0: [sda] tag#215 CDB: Read(16) 88 00 00 00 00 00 00 05 27 48 00 00 00 08 00 00
Oct 19 04:03:48 charon kernel: blk_update_request: I/O error, dev sda, sector 337736 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
Oct 19 04:03:48 charon kernel: BTRFS error (device sda1): bdev /dev/sda1 errs: wr 1, rd 1, flush 0, corrupt 3, gen 0

at which point the drive is now effectively offline :/
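
For anyone reading the log above: the failed command is an ordinary small read, not anything exotic. A minimal sketch of decoding the Read(16) CDB from the tag#215 line follows. The function name decode_read16 is made up for illustration; the byte layout (opcode 0x88 in byte 0, big-endian LBA in bytes 2-9, big-endian transfer length in bytes 10-13) is the standard READ(16) format.

```shell
#!/bin/sh
# Decode the LBA and transfer length from a READ(16) CDB given as
# 16 space-separated hex bytes (the format the kernel log uses).
# decode_read16 is an illustrative helper, not part of any tool.
decode_read16() {
    # Split the argument into positional parameters: $1 = byte 0, etc.
    # shellcheck disable=SC2086
    set -- $1
    [ "$#" -eq 16 ] || { echo "expected 16 bytes" >&2; return 1; }
    [ "$1" = "88" ] || { echo "not a READ(16) CDB" >&2; return 1; }
    # Bytes 2-9 are the LBA, bytes 10-13 the transfer length (big-endian).
    # Note: an LBA with the top bit set would overflow signed 64-bit
    # shell arithmetic; that is not a concern for this log line.
    lba=$(( 0x$3$4$5$6$7$8$9${10} ))
    len=$(( 0x${11}${12}${13}${14} ))
    echo "lba=$lba blocks=$len"
}

decode_read16 "88 00 00 00 00 00 00 05 27 48 00 00 00 08 00 00"
```

This prints lba=337736 blocks=8, which agrees with the "sector 337736" reported by blk_update_request, so the timeout hit a plain 8-block read issued to the sleeping array.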

I tried upping the timeout:

root@charon:/etc/udev/rules.d# cat 99-aacraid.rules
SUBSYSTEM=="block", ACTION=="add", ENV{ID_VENDOR}=="Adaptec", ENV{ID_MODEL}=="Shared", RUN+="/bin/sh -c 'echo 135 > /sys/block/%k/device/timeout'"

but that didn't appear to stop the problem occurring (and the kernel wasn't
over happy about a >120s timeout).
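
One thing that may be worth ruling out is whether the rule matched at all. A rough sketch for reading back the values the kernel is actually using is below. It assumes the usual sysfs attributes timeout and eh_timeout under /sys/block/<dev>/device; the function name and the SYSFS_ROOT override are hypothetical and exist only so the snippet can be tried against a scratch directory.

```shell
#!/bin/sh
# Print the SCSI command timeout and error-handler timeout (in seconds)
# for a block device. SYSFS_ROOT is a hypothetical override for testing
# and defaults to /sys.
show_scsi_timeouts() {
    dev=$1
    base="${SYSFS_ROOT:-/sys}/block/$dev/device"
    for attr in timeout eh_timeout; do
        if [ -r "$base/$attr" ]; then
            echo "$attr=$(cat "$base/$attr")"
        else
            echo "$attr=unavailable"
        fi
    done
}

# Example: show_scsi_timeouts sda
# To check the keys the udev rule matches on:
#   udevadm info --query=property --name=/dev/sda | grep -E '^ID_(VENDOR|MODEL)='
```

If timeout does not read back as 135, the ID_VENDOR / ID_MODEL match is the first thing to check.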

Any help much appreciated.
David

Can you send a 'START/STOP UNIT' command to the device,
e.g. via sg_start /dev/sda?

It looks to me as if the devices are simply spun down, and for some reason the driver doesn't report this correctly.
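
A minimal sketch of that test, assuming sg3_utils is installed and the commands are run as root. The wrapper function spin_up is hypothetical; sg_turs (TEST UNIT READY) and sg_start --start (START STOP UNIT with the START bit set) are the real tools. Whether the Adaptec logical device honours START STOP UNIT is exactly what this would find out.

```shell
#!/bin/sh
# Try to spin up a SCSI device and report whether it then answers
# TEST UNIT READY. Requires sg3_utils and root privileges.
# spin_up is an illustrative wrapper, not an existing command.
spin_up() {
    dev=$1
    if sg_turs "$dev" >/dev/null 2>&1; then
        echo "$dev: already ready"
        return 0
    fi
    # Send START STOP UNIT with START=1.
    sg_start --start "$dev" || {
        echo "$dev: START STOP UNIT failed" >&2
        return 1
    }
    if sg_turs "$dev" >/dev/null 2>&1; then
        echo "$dev: ready after START"
    else
        echo "$dev: still not ready" >&2
        return 1
    fi
}

# Example (as root): spin_up /dev/sda
# If the kernel has already offlined the device, it may first need:
#   echo running > /sys/block/sda/device/state
```

Running it once while the drives are powered down, before any filesystem I/O touches the array, should show whether an explicit start is enough to wake them.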

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


