FW: change in disk failure policy for non-BBL arrays?

Hello,
I was looking at this again today, and it appears that with this change, error handling no longer works correctly in RAID10 (I haven't checked the other levels yet). Without a BBL configured, a read error now cycles through fix_read_error until max_read_errors is exceeded, and only then is the drive kicked out of the array. For example, if I inject errors in response to both read and write commands at sector 16392 of /dev/sda, the logs in response to a read of the corresponding md0 sector look like:
 
(many repeats)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: unable to read back corrected sectors (8 sectors at 16392 on sda)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: failing drive
Oct 27 16:15:16 c1 kernel: md/raid10:md0: read correction write failed (8 sectors at 16392 on sda)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: failing drive
Oct 27 16:15:16 c1 kernel: md/raid10:md0: unable to read back corrected sectors (8 sectors at 16392 on sda)
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: failing drive
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: Raid device exceeded read_error threshold [cur 21:max 20]
Oct 27 16:15:16 c1 kernel: md/raid10:md0: sda: Failing raid device
Oct 27 16:15:16 c1 kernel: md/raid10:md0: Disk failure on sda, disabling device.

Previously, the drive would have been failed out of the array by the call to md_error at the end of r10_sync_page_io.
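
For context, the tail of r10_sync_page_io looks roughly like this (a paraphrased sketch from my reading of drivers/md/raid10.c, not an exact copy of the tree):

        if (rw == WRITE) {
                set_bit(WriteErrorSeen, &rdev->flags);
                if (!test_and_set_bit(WantReplacement, &rdev->flags))
                        set_bit(MD_RECOVERY_NEEDED, &rdev->mddev->recovery);
        }
        /* Without a BBL, md_set_badblocks() used to return 0 here, so md_error()
         * ran and the drive was failed; badblocks_set() now reports success, so
         * this branch is never taken and fix_read_error just keeps retrying. */
        if (!rdev_set_badblocks(rdev, sector, sectors, 0))
                md_error(rdev->mddev, rdev);
        return 0;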

Is there an appetite for a patch that takes the easy way out by reverting to the previous behavior with changes like the following:

-       if (!rdev_set_badblocks(rdev, sector, sectors, 0))
+       if (!rdev_set_badblocks(rdev, sector, sectors, 0) || rdev->badblocks.shift < 0)
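
In context, the patched tail of r10_sync_page_io would then read roughly as follows (same caveat, just a sketch):

        /* sketch: also fail the drive when the BBL is disabled (shift < 0),
         * not only when recording the bad block genuinely fails */
        if (!rdev_set_badblocks(rdev, sector, sectors, 0) ||
            rdev->badblocks.shift < 0)
                md_error(rdev->mddev, rdev);

The same test would presumably have to be repeated at every rdev_set_badblocks call site in raid1.c/raid10.c, so a small helper (or handling it inside rdev_set_badblocks itself) might be cleaner than open-coding the shift check everywhere.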

Thanks,
Chris


On 10/23/17, 5:23 PM, "Chris Walker" <cwalker@xxxxxxxx> wrote:

    Hello,
    
    We've noticed that for an array on which the bad block list has been 
    disabled, a failed write from a 'check' operation no longer causes the 
    offending disk to be failed out of the array.  As far as I can tell, 
    this behavior changed with commit
    https://github.com/torvalds/linux/commit/fc974ee2bffdde47d1e4b220cf326952cc2c4794, 
    which adopted the block layer badblocks code and deprecated the 
    MD-specific code.
    
    It looks like this commit changed underlying code that adds a range of 
    bad blocks to the BB table (md_set_badblocks --> badblocks_set) such 
    that the sense of the return code reversed, from 0 meaning an error 
    occurred to 0 meaning success, but the return code due to a disabled BB 
    was left at 0.  With this change, therefore, for arrays without a BBL, 
    calls to 'rdev_set_badblocks' changed from always a failure to always a 
    success, and code such as
    
                                     if (rdev_set_badblocks(
                                                 rdev,
                                                 r10_bio->devs[m].addr,
                                                 r10_bio->sectors, 0))
                                             md_error(conf->mddev, rdev);
    
    that previously would have failed the disk no longer do.  Was this 
    change in policy deliberate?
    
    Thanks,
    Chris
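
P.S. For reference, the return-code inversion described in the quoted message comes down to roughly this, sketched from my reading of md.c/badblocks.c (not exact code):

        /* old: md_set_badblocks() returned 1 on success and 0 both on failure
         * and when the BBL was disabled (bb->shift < 0), so with no BBL the
         * callers always saw 0 and called md_error() */
        rv = md_set_badblocks(&rdev->badblocks, s, sectors, 0);
        return rv;

        /* new: badblocks_set() returns 0 on success, but it also returns 0
         * when the BBL is disabled, so the wrapper now reports success and
         * the md_error() paths are never taken */
        rv = badblocks_set(&rdev->badblocks, s, sectors, 0);
        return rv == 0 ? 1 : 0;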
    
