Re: mdadm -> BTRFS conversion

Chris Murphy <lists@xxxxxxxxxxxxxxxxx> · Fri, 2 Apr 2021 13:12:37 -0600

On Fri, Apr 2, 2021 at 4:23 AM Patrick O'Callaghan
<pocallaghan@xxxxxxxxx> wrote:
>
> On Thu, 2021-04-01 at 23:52 -0600, Chris Murphy wrote:
> > It's not an SMR concern, it's making sure the drive gives up on errors
> > faster than the kernel tries to reset due to what it thinks is a
> > hanging drive.
> >
> > smartctl -l scterc /dev/sdX
> >
> > That'll tell you the default setting. I'm pretty sure Blues come with
> > SCT ERC disabled. Some support it. Some don't. If it's supported
> > you'll want to set it for something like 70-100 deciseconds (the units
> > SATA drives use for this feature).
>
> One doesn't and one does:
>
> # smartctl -l scterc /dev/sdd
> smartctl 7.2 2021-01-17 r5171 [x86_64-linux-5.11.10-200.fc33.x86_64] (local build)
> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control command not supported
>
> # smartctl -l scterc /dev/sde
> smartctl 7.2 2021-01-17 r5171 [x86_64-linux-5.11.10-200.fc33.x86_64] (local build)
> Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
>            Read:     85 (8.5 seconds)
>           Write:     85 (8.5 seconds)
>
> So I guess the /dev/sde drive is set correctly, right? Or would you
> recommend disabling SCT ERC for this drive?

Leave /dev/sde alone, 85 deciseconds is fine.
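
The reason it's fine: 8.5 seconds is well under the kernel's command
timer, which defaults to 30 seconds for SATA/SCSI disks, so the drive
reports its error before the kernel decides it has hung. A rough sketch
of that comparison follows. The function name erc_ok is made up for
illustration, and the commented example assumes smartctl prints the
"Read:" line in the format you pasted.

```shell
#!/bin/sh
# erc_ok ERC_DECISECONDS KERNEL_TIMEOUT_SECONDS
# Succeeds when the drive's ERC limit is enabled and shorter than the
# kernel command timer. Hypothetical helper, not part of smartmontools.
erc_ok() {
    erc_ds=$1
    timeout_s=$2
    # Reject anything that is not a plain number (e.g. "Disabled").
    case $erc_ds in
        ''|*[!0-9]*) return 1 ;;
    esac
    # 0 means ERC is off, so the drive may retry for a very long time.
    [ "$erc_ds" -gt 0 ] && [ "$erc_ds" -lt $((timeout_s * 10)) ]
}

# Example (needs root; /dev/sde is the drive from this thread):
#   erc=$(smartctl -l scterc /dev/sde | awk '/Read:/ {print $2}')
#   erc_ok "$erc" "$(cat /sys/block/sde/device/timeout)" && echo ok
```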

Not much can be done with /dev/sdd itself. But it is possible to
increase the kernel's command timer for this drive, usually via sysfs.
I think it can also be done with a udev rule, but I'm having a bit of
a lapse on how to do it. Udev needs to identify the device by serial
number or wwn, but changing the timeout via sysfs requires knowing
what the /dev node is, which of course can change each time you boot
or plug the device in. I don't know enough about udev. But there
should be examples on the internet, or you can just adapt the
linux-raid wiki guide.
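
One way around the changing /dev node without udev, as a minimal
sketch: resolve a stable symlink (the kind found under
/dev/disk/by-id/) to whatever kernel name the drive has right now,
then write the timeout through sysfs. The function name and the
optional third argument (an alternate sysfs root, handy for trying it
out without root) are my own inventions, and the symlink path in the
example is a placeholder.

```shell
#!/bin/sh
# set_timeout_by_id ID_PATH SECONDS [SYSFS_ROOT]
# Resolve a stable device symlink to its current kernel name and set
# the kernel command timer for that disk. Hypothetical helper.
set_timeout_by_id() {
    id_path=$1
    seconds=$2
    sysfs=${3:-/sys}
    dev=$(readlink -f "$id_path") || return 1
    name=${dev##*/}                      # e.g. /dev/sdd -> sdd
    target="$sysfs/block/$name/device/timeout"
    [ -w "$target" ] || { echo "cannot write $target" >&2; return 1; }
    echo "$seconds" > "$target"
}

# Example (needs root; replace the placeholder with the real symlink):
#   set_timeout_by_id /dev/disk/by-id/<your-drive-id> 180
```

The value does not survive a reboot, so this would have to run at
each boot (or each time the drive is plugged in).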

The alternatives? Change the timeout for all /dev/ nodes. That's how
things are by default on Windows and macOS: they just wait a long time
before resetting a drive, giving it enough time to give up on its own.
The negative side effect is that you might get a long delay without
errors, should the device develop marginally bad sectors.
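
A sketch of the blanket approach, assuming every disk of interest
shows up as sd* under /sys/block. The function name and the optional
sysfs-root argument are hypothetical, and the setting resets at
reboot.

```shell
#!/bin/sh
# set_timeout_all SECONDS [SYSFS_ROOT]
# Write the same command timer to every sd* disk. Hypothetical helper.
set_timeout_all() {
    seconds=$1
    sysfs=${2:-/sys}
    for f in "$sysfs"/block/sd*/device/timeout; do
        # Skips unwritable entries and the literal glob if nothing matched.
        [ -w "$f" ] || continue
        echo "$seconds" > "$f"
    done
}

# Example (needs root):
#   set_timeout_all 180
```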

Another alternative is to just leave it alone, and periodically check
(manually or automate it somehow) for the telltale signs of bad
sectors masked by SATA link resets.

Looks like this:

kernel: ata7.00: status: { DRDY }
kernel: ata7.00: failed command: READ FPDMA QUEUED
kernel: ata7.00: cmd 60/40:f0:98:d2:2b/05:00:45:00:00/40 tag 30 ncq dma 688128 in
kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

With this interleaved occasionally:

kernel: ata7: hard resetting link
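
If you want to automate the check, here is a minimal sketch that
counts the reset lines per ATA port. It assumes the message text
matches the line above, and the function name is made up.

```shell
#!/bin/sh
# count_link_resets: read kernel log text on stdin and print, for each
# ATA port, how many "hard resetting link" lines were seen.
# Hypothetical helper.
count_link_resets() {
    awk '/ata[0-9]+: hard resetting link/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^ata[0-9]+:$/) {
                     sub(/:$/, "", $i)   # ata7: -> ata7
                     n[$i]++
                 }
         }
         END { for (p in n) print p, n[p] }' | sort
}

# Example: scan the kernel messages from the current boot.
#   journalctl -k -b | count_link_resets
```

No output means no resets were logged in the text it was given.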

If it happens, *then* you can increase the timeout manually and
initiate a scrub. As long as the timeout is set high enough (most
sources suggest 180 seconds which, yes, is incredible), eventually
the drive will give up, spit out an error, and Btrfs will fix up that
sector by overwriting it with good data. It could be months, years, or
never, before it happens.

-- 
Chris Murphy