This is a repost of the issue (from a month ago) that did not get a response then.

Executive summary: After '--add'ing a new member, a 'recovery' starts automatically, but 'sync_max' is not reset, and the recovery hangs part way through, at the point where sync_max happened to be set. This is a 7-disk raid6.

Is this a known issue? Has it been fixed since? Did I do something wrong?

This machine runs the older f19.

$ uname -a
Linux e7.eyal.emu.id.au 3.14.27-100.fc19.x86_64 #1 SMP Wed Dec 17 19:36:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

mdadm was built from source:

$ sudo mdadm --version
mdadm - v4.0 - 2017-01-09

The long story:

I had a disk fail in a raid6. After some 'pending' sectors were logged I decided to do a 'check' around that location by setting sync_min/sync_max and echoing 'check'. This is done with a script doing:

# echo 4336657408 >/sys/block/md127/md/sync_min
# echo 4339803136 >/sys/block/md127/md/sync_max
# echo check >/sys/block/md127/md/sync_action

The messages then say:

Feb 18 13:46:31 e7 kernel: [ 976.688691] md: data-check of RAID array md127
Feb 18 13:46:31 e7 kernel: [ 976.693254] md: minimum _guaranteed_ speed: 150000 KB/sec/disk.
Feb 18 13:46:31 e7 kernel: [ 976.699479] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Feb 18 13:46:31 e7 kernel: [ 976.709420] md: using 128k window, over a total of 3906885120k.
Feb 18 13:46:31 e7 kernel: [ 976.715457] md: resuming data-check of md127 from checkpoint.

Sure enough, this elicited disk errors, but the disk did not recover and was kicked out of the array. Moreover, it became unresponsive and needed a power cycle, so I shut down and rebooted the machine.

messages (many i/o errors, then sdf completely disappeared; errors at sectors 4337414{000,040,168}):

Feb 18 13:47:08 e7 kernel: [ 1014.334781] md: super_written gets error=-5, uptodate=0
Feb 18 13:47:08 e7 kernel: [ 1014.340024] md/raid:md127: Disk failure on sdf1, disabling device.
Feb 18 13:47:08 e7 kernel: [ 1014.340024] md/raid:md127: Operation continuing on 6 devices.
Feb 18 13:47:08 e7 kernel: [ 1014.417307] md: md127: data-check interrupted.

After a second power off/on, another check produced the same result. At this point I added a fresh disk:

$ sudo mdadm /dev/md127 --add /dev/sdj1

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdj1[11] sdf1[7](F) sdi1[8] sde1[9] sdh1[12] sdc1[0] sdg1[13] sdd1[10]
      19534425600 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [UUU_UUU]
      [>....................] recovery = 0.7% (29805572/3906885120) finish=509.2min speed=126880K/sec
      bitmap: 7/30 pages [28KB], 65536KB chunk

messages:

Feb 18 14:23:10 e7 kernel: [ 3177.183250] md: bind<sdj1>
Feb 18 14:23:10 e7 kernel: [ 3177.255529] md: recovery of RAID array md127
Feb 18 14:23:10 e7 kernel: [ 3177.259894] md: minimum _guaranteed_ speed: 150000 KB/sec/disk.
Feb 18 14:23:10 e7 kernel: [ 3177.265994] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Feb 18 14:23:10 e7 kernel: [ 3177.275736] md: using 128k window, over a total of 3906885120k.

However, the recovery stopped progressing at one point (my script logs /proc/mdstat every 10 seconds).
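That logging script is nothing fancier than a loop along these lines; a minimal sketch, where the log file name is an illustrative assumption and the real script differs in the details:

  #!/bin/sh
  # Append a timestamped copy of the md127 progress line every 10 seconds.
  LOG=/root/md127-progress.log        # illustrative path
  while sleep 10; do
      prog=$(grep -A 2 '^md127' /proc/mdstat | grep -E 'recovery|resync|check')
      echo "$(date '+%Y-%m-%d %H:%M:%S') $prog" >>"$LOG"
  done

Here is what it captured around the stall: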
2017-02-18 20:02:48 [===========>.........] recovery = 55.4% (2166229192/3906885120) finish=372.8min speed=77803K/sec
2017-02-18 20:02:58 [===========>.........] recovery = 55.4% (2167083344/3906885120) finish=366.2min speed=79159K/sec
2017-02-18 20:03:08 [===========>.........] recovery = 55.4% (2167819876/3906885120) finish=374.8min speed=77316K/sec
2017-02-18 20:03:18 [===========>.........] recovery = 55.5% (2168520428/3906885120) finish=375.4min speed=77157K/sec
2017-02-18 20:03:28 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=489.4min speed=59194K/sec
2017-02-18 20:03:38 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=608.7min speed=47588K/sec
2017-02-18 20:03:48 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=728.1min speed=39786K/sec
2017-02-18 20:03:58 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=847.5min speed=34182K/sec
... no progress anymore ...
2017-02-18 22:36:44 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=110261.8min speed=262K/sec
2017-02-18 22:36:54 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=110381.2min speed=262K/sec
2017-02-18 22:37:04 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=110500.6min speed=262K/sec
2017-02-18 22:37:14 [===========>.........] recovery = 55.5% (2168590848/3906885120) finish=110619.9min speed=261K/sec

After some thinking I realised that it had paused at the point where the earlier 'check' failed. This was unexpected. I followed with

# echo 'max' >/sys/block/md127/md/sync_max

and the recovery moved on:

2017-02-18 22:37:24 [===========>.........] recovery = 55.5% (2168938500/3906885120) finish=117500.2min speed=246K/sec
2017-02-18 22:37:34 [===========>.........] recovery = 55.5% (2169997568/3906885120) finish=105201.7min speed=275K/sec
2017-02-18 22:37:44 [===========>.........] recovery = 55.5% (2171066120/3906885120) finish=90962.0min speed=318K/sec
2017-02-18 22:37:54 [===========>.........] recovery = 55.5% (2172125192/3906885120) finish=269.9min speed=107101K/sec
2017-02-18 22:38:04 [===========>.........] recovery = 55.6% (2173114372/3906885120) finish=272.1min speed=106165K/sec
2017-02-18 22:38:14 [===========>.........] recovery = 55.6% (2174004224/3906885120) finish=287.3min speed=100492K/sec

and it completed over six hours later:

Feb 19 04:49:16 e7 kernel: [55167.633100] md: md127: recovery done.

TIA

-- 
Eyal Lebedinsky (eyal@xxxxxxxxxxxxxx)
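P.S. For anyone who runs into the same stall: where the recovery is pinned can be confirmed directly from sysfs. A minimal sketch of what to look at, assuming the array is md127:

$ cat /sys/block/md127/md/sync_completed
$ cat /sys/block/md127/md/sync_min
$ cat /sys/block/md127/md/sync_max

If sync_max still shows the value left over from an earlier bounded 'check' instead of 'max', the recovery will not move past that point until it is widened again:

# echo max >/sys/block/md127/md/sync_max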