Re: mdadm -> BTRFS conversion

Roger Heflin <rogerheflin@xxxxxxxxx> · Fri, 2 Apr 2021 14:29:56 -0500

I turn my scterc down as low as the drive will allow.  How low I can
go varies by model.  I have a loop that starts at 70 and then keeps
going down, so that it ends up setting each disk as low as is
allowed, as far down as 10.  My WD Reds allow a min of 20, and I have a
Seagate that allows 10.

But even set to 10 (1.0 sec), at a slow 15 ms/retry, that is about 66
retries reading the block.  If it has not got it in that many tries,
then the drive might as well give up and let mdadm or something
rewrite good data to the disk.
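
A minimal sketch of a loop like that, not the exact script.  It assumes
that a drive which rejects a value keeps the last value it accepted,
and that stepping by 10 is fine.  SMARTCTL and lower_scterc are
hypothetical names; SMARTCTL only exists so the loop can be dry-run
with echo.

```shell
#!/bin/sh
# Sketch: step SCT ERC down from 70 to 10 deciseconds on each drive given.
# Every value is attempted in turn. Assumption: a drive that refuses a
# value keeps the last one it accepted, so it ends up at its own minimum.
# SMARTCTL is a hypothetical override, e.g. SMARTCTL=echo for a dry run.
SMARTCTL="${SMARTCTL:-smartctl}"

lower_scterc() {
    for dev in "$@"; do
        for ds in 70 60 50 40 30 20 10; do
            # Read and write limits get the same value. Failures from
            # values the drive refuses are ignored on purpose.
            $SMARTCTL -l "scterc,$ds,$ds" "$dev" || true
        done
    done
}

# Does nothing when no devices are passed on the command line.
lower_scterc "$@"
```

Saved under a name of your choosing and run as root with the device
nodes as arguments, this tries each value on each disk.  As far as I
know the setting is lost on a power cycle on many drives, so it would
need to run at every boot.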

On Fri, Apr 2, 2021 at 2:13 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Apr 2, 2021 at 4:23 AM Patrick O'Callaghan
> <pocallaghan@xxxxxxxxx> wrote:
> >
> > On Thu, 2021-04-01 at 23:52 -0600, Chris Murphy wrote:
> > > It's not an SMR concern, it's making sure the drive gives up on errors
> > > faster than the kernel tries to reset due to what it thinks is a
> > > hanging drive.
> > >
> > > smartctl -l scterc /dev/sdX
> > >
> > > That'll tell you the default setting. I'm pretty sure Blues come with
> > > SCT ERC disabled. Some support it. Some don't. If it's supported
> > > you'll want to set it for something like 70-100 deciseconds (the
> > > units SATA drives use for this feature).
> >
> > One doesn't and one does:
> >
> > # smartctl -l scterc /dev/sdd
> > smartctl 7.2 2021-01-17 r5171 [x86_64-linux-5.11.10-200.fc33.x86_64] (local build)
> > Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
> >
> > SCT Error Recovery Control command not supported
> >
> > # smartctl -l scterc /dev/sde
> > smartctl 7.2 2021-01-17 r5171 [x86_64-linux-5.11.10-200.fc33.x86_64] (local build)
> > Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
> >
> > SCT Error Recovery Control:
> >            Read:     85 (8.5 seconds)
> >           Write:     85 (8.5 seconds)
> >
> > So I guess the /dev/sde drive is set correctly, right? Or would you
> > recommend disabling SCT ERC for this drive?
>
> Leave /dev/sde alone, 85 deciseconds is fine.
>
> Not much can be done with /dev/sdd itself directly. But it is possible
> to increase the kernel's command timer for this drive. The usual way
> of doing this is via sysfs.  I think it can be done with a udev rule
> as well, but I'm having a bit of a lapse on how to do it. Udev needs to
> identify the device by serial number or wwn, but changing the timeout
> via sysfs requires knowing what the /dev node is, which of course can
> change each time you boot or plug the device in. I don't know enough
> about udev. But there should be examples on the internet or you can
> just fudge it with the linux-raid wiki guide.
>
> The alternatives? Change the timeout for all /dev/ nodes. That's how
> things are by default on Windows and macOS, they just wait a long time
> before resetting a drive, giving it enough time for it to give up on
> its own. The negative side effect is you might get a long delay
> without errors, should the device develop marginally bad sectors.
>
> Another alternative is to just leave it alone, and periodically check
> (manually or automate it somehow) for the telltale signs of bad
> sectors masked by SATA link resets.
>
> Looks like this:
>
> kernel: ata7.00: status: { DRDY }
> kernel: ata7.00: failed command: READ FPDMA QUEUED
> kernel: ata7.00: cmd 60/40:f0:98:d2:2b/05:00:45:00:00/40 tag 30 ncq dma 688128 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>
> With this interleaved occasionally:
>
> kernel: ata7: hard resetting link
>
> If it happens, *then* you can increase the timeout manually and
> initiate a scrub. As long as the timeout is set high enough (most
> sources suggest 180 seconds which, yes, is incredible), eventually
> the drive will give up, spit out an error, and Btrfs will fix up that
> sector by overwriting it with good data. It could be months, years, or
> never, before it happens.
>
>
> --
> Chris Murphy
> _______________________________________________
> users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure