Re: Slow initial resync in RAID6 with 36 SAS drives

On Thu, Aug 19, 2021 at 11:28 AM Marcin Wanat <marcin.wanat@xxxxxxxxx> wrote:
>
> Sorry, this will be a long email with everything I find to be relevant.
> I have an mdraid6 array with 36 SAS HDDs, each able to do >200MB/s,
> but I am unable to get more than 38MB/s resync speed on a fast system
> (48 cores / 96GB RAM) with no other load.

I have done a bit more research on a server with 24 NVMe drives and found
that the resync speed bottleneck affects RAID6 arrays with more than 16 drives:

# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=16 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 \
    /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 \
    /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1 /dev/nvme15n1 \
    /dev/nvme16n1
# iostat -dx 5
Device            r/s     w/s      rkB/s    wkB/s    rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme0n1          0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme1n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.88    0.00   0.99   470.84     2.25   2.51  86.04
nvme4n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.89    0.00   0.99   470.84     2.25   2.51  86.06
nvme5n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.89    0.00   0.99   470.84     2.25   2.51  86.14
nvme10n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.90    0.00   0.99   470.84     2.25   2.51  86.20
nvme9n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.91    0.00   1.00   470.84     2.25   2.53  86.76
nvme13n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.93    0.00   1.00   470.84     2.25   2.54  87.00
nvme12n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.94    0.00   1.01   470.84     2.25   2.54  87.08
nvme8n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.93    0.00   1.00   470.84     2.25   2.54  87.02
nvme14n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.96    0.00   1.01   470.84     2.25   2.56  87.64
nvme22n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme17n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme16n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    3.05    0.00   1.04   470.84     2.25   2.58  88.56
nvme19n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme2n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.94    0.00   1.01   470.84     2.25   2.54  87.20
nvme6n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.95    0.00   1.01   470.84     2.25   2.55  87.52
nvme7n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.94    0.00   1.01   470.84     2.25   2.54  87.22
nvme21n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme11n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.96    0.00   1.02   470.84     2.25   2.56  87.72
nvme15n1       342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.99    0.00   1.02   470.84     2.25   2.53  86.84
nvme23n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme18n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme3n1        342.60    0.40  161311.20     0.90  39996.60     0.00  99.15   0.00    2.97    0.00   1.02   470.84     2.25   2.53  86.66
nvme20n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00

As you can see, each member is doing ~342 read IOPS with a ~470 KiB average
request size (rareq-sz). But when I create a RAID6 with 17 drives or more:

# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=17 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 \
    /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 \
    /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1 /dev/nvme15n1 \
    /dev/nvme16n1 /dev/nvme17n1
# iostat -dx 5
Device            r/s     w/s      rkB/s    wkB/s    rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme0n1          0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme1n1      21484.20    0.40   85936.80     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.82     4.00     2.25   0.05  99.16
nvme4n1      21484.00    0.40   85936.00     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.74     4.00     2.25   0.05  99.16
nvme5n1      21484.00    0.40   85936.00     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.84     4.00     2.25   0.05  99.16
nvme10n1     21483.80    0.40   85935.20     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.65     4.00     2.25   0.04  83.64
nvme9n1      21483.80    0.40   85935.20     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.67     4.00     2.25   0.04  85.86
nvme13n1     21483.60    0.40   85934.40     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.63     4.00     2.25   0.04  83.66
nvme12n1     21483.60    0.40   85934.40     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.65     4.00     2.25   0.04  83.66
nvme8n1      21483.60    0.40   85934.40     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.81     4.00     2.25   0.05  99.22
nvme14n1     21481.80    0.40   85927.20     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.67     4.00     2.25   0.04  83.66
nvme22n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme17n1     21482.00    0.40   85928.00     0.90      0.00     0.00   0.00   0.00    0.02    0.00   0.49     4.00     2.25   0.03  67.12
nvme16n1     21481.60    0.40   85926.40     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.75     4.00     2.25   0.04  83.66
nvme19n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme2n1      21481.60    0.40   85926.40     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.95     4.00     2.25   0.05  99.26
nvme6n1      21481.60    0.40   85926.40     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.91     4.00     2.25   0.05  99.26
nvme7n1      21481.60    0.40   85926.40     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.87     4.00     2.25   0.05  99.24
nvme21n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme11n1     21481.20    0.40   85924.80     0.90      0.00     0.00   0.00   0.00    0.03    0.00   0.75     4.00     2.25   0.04  83.66
nvme15n1     21480.20    0.40   85920.80     0.90      0.00     0.00   0.00   0.00    0.04    0.00   0.80     4.00     2.25   0.04  83.66
nvme23n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme18n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme3n1      21480.40    0.40   85921.60     0.90      0.00     0.00   0.00   0.00    0.05    0.00   1.02     4.00     2.25   0.05  99.26
nvme20n1         0.00    0.00       0.00     0.00      0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00

rareq-sz drops to 4 KiB, per-member IOPS increase to ~21,483, and resync speed drops to 85MB/s.
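
Restating the iostat numbers above (this is just the rkB/s column, nothing new),
and assuming I understand the resync process correctly in that it advances
across all members in parallel so the per-member read rate is effectively the
resync rate, it is the request size rather than the IOPS that makes the difference:

  16 devices: ~342.6 req/s x ~470.84 KiB/req ~= 161,000 KiB/s ~= 157 MiB/s per member
  17 devices: ~21,483 req/s x 4.00 KiB/req   ~=  85,900 KiB/s ~=  84 MiB/s per member

In other words, the drives are still 83-99% utilized in the 17-drive case; they
just spend their time on 4 KiB reads instead of ~470 KiB ones.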

Why is it like that? Could someone point me to the part of the md RAID kernel
code responsible for this limitation? Would changing it and recompiling the
kernel be safe on a machine with 512GB+ of RAM?
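
For reference, the generic md resync knobs (speed limits, stripe cache size,
stripe handling threads) are below; the values are illustrative only, and none
of them seems to explain the 16-device cutoff itself:

# echo 200000  > /proc/sys/dev/raid/speed_limit_min     (resync floor, KiB/s)
# echo 2000000 > /proc/sys/dev/raid/speed_limit_max     (resync ceiling, KiB/s)
# echo 8192    > /sys/block/md0/md/stripe_cache_size    (raid5/6 stripe cache entries, default 256)
# echo 4       > /sys/block/md0/md/group_thread_cnt     (extra stripe-handling threads, default 0)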

Regards,
Marcin Wanat


