Re: No I/O errors reported after SATA link hard reset

Hello,

On Thu, Aug 17, 2017 at 04:15:35PM +0200, Gionatan Danti wrote:
> Ok, so *this* is the root cause of the problem: libata not
> identifying spurious link renegotiations vs brief powerloss/powerup
> events. Out of curiosity: is this a SATA-specific problem (ie: in
> the SATA specification), or even SAS disks are affected?

No idea about SAS.  They're identical at the link layer tho.

> >Because we don't wanna be ditching disks on temporary link glitches,
> >which do happen once in a while.
> 
> Any chance to report I/O errors to the upper layers *without*
> offlining the device? In this manner, upper layers (ie: MDRAID) can
> act in a more informed way. For example: a single disk device will
> simply retry the failed operation, while MDRAID can take the
> "badblocks" code path to deal with the error.

The upper layer can request to avoid retrying on errors but it won't
help much.  The problem doesn't have much to do with specific commands.
A power event can take place without any command in flight and lose the
buffered data.  Unless the upper layer is tracking everything that's
being written, there isn't much it can do short of a full scan.  This
is a condition which should be handled from the driver side.

> >So, the right way to deal with the problem probably is making use of
> >the SMART counter which indicates power loss events and verify that
> >the counter hasn't increased over link issues.  If it changed, the
> >device should be detached and re-probed, which will make it come back
> >as a different block device.  Unfortunately, I haven't had the chance
> >to actually implement that.
> 
> This is a very good idea, maybe I can implement it in userspace with
> a simple, fast polling scheme (for example, each 60 seconds). Such a
> polling would not prevent all corruption scenarios, but will at
> least timely inform the user.

Yeah, looking into getting it implemented on the kernel side.
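
For the userspace polling idea quoted above, a rough sketch could look
like the following.  It is only an illustration under a few
assumptions: that smartmontools is installed and `smartctl -A` prints
the usual ATA attribute table, and that attributes 12
(Power_Cycle_Count) and 192 (Power-Off_Retract_Count) are the counters
worth watching.  Which attribute actually tracks power loss varies by
vendor, and the function names here are made up for the example.

```python
import subprocess
import time

# Assumption: these attribute IDs track power events on the drive in
# question. Vendors differ, so check the drive's own attribute table.
POWER_ATTR_IDS = {12, 192}


def parse_power_counters(smartctl_output):
    """Extract raw values of the watched attributes from `smartctl -A` text."""
    counters = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header and informational lines do not.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw_value = fields[9]
        if attr_id in POWER_ATTR_IDS and raw_value.isdigit():
            counters[attr_id] = int(raw_value)
    return counters


def increased(previous, current):
    """Return the sorted attribute IDs whose raw value grew since last poll."""
    return sorted(
        attr_id
        for attr_id, value in current.items()
        if attr_id in previous and value > previous[attr_id]
    )


def read_counters(device):
    """Run smartctl on the device and parse the watched counters."""
    # smartctl's exit status is a bitmask that can be non-zero even when
    # the table was printed, so the return code is not checked here.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_power_counters(result.stdout)


def poll(device, interval=60):
    """Warn whenever a power-related counter increases between polls."""
    previous = read_counters(device)
    while True:
        time.sleep(interval)
        current = read_counters(device)
        changed = increased(previous, current)
        if changed:
            print(f"{device}: power counter(s) {changed} increased, "
                  "buffered writes may have been lost")
        previous = current
```

Calling something like poll("/dev/sda") as root would then print a
warning within one polling interval of a power event.  As noted in the
quoted text, this only informs the user after the fact and does not
prevent the data loss itself.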

Thanks.

-- 
tejun


