Re: Raid6 check performance regression 5.15 -> 5.16


 



I have tested this.  The patch seems to fix the issue.

Test method was:

Fedora 5.16.11-200: check is broken, taking about 4h50m to 5h06m (the
two runs that I have data for).
kernel.org 5.16.13 + this patch: 17% done in 25 min, with about 100 more
minutes to finish. It seems to be fast again: the predicted total of
around 2 hr is consistent with the good speed before 5.16.
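
As a rough cross-check of those numbers, here is a minimal sketch that
linearly extrapolates the total check time from the progress so far. The
function name is made up for illustration, and it assumes the check runs at a
constant rate, which is probably not how the kernel derives its own
"minutes to finish" figure, so the two estimates need not agree exactly.

```python
def estimate_check_minutes(percent_done, elapsed_min):
    """Linearly extrapolate (total, remaining) minutes from progress so far.

    Assumes a constant check rate over the whole array (an assumption,
    not something the kernel guarantees).
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total = elapsed_min * 100.0 / percent_done
    return total, total - elapsed_min


# Figures from the patched 5.16.13 run: 17% done after 25 minutes.
total, remaining = estimate_check_minutes(17, 25)
print(f"total ~{total:.0f} min, remaining ~{remaining:.0f} min")
```

With 17% done in 25 minutes this gives a total of roughly two and a half
hours, a bit above the 100 minutes remaining reported above, but either way
far below the nearly 5 hours seen on the unpatched kernel.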

On Wed, Mar 9, 2022 at 12:35 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Tue, Mar 8, 2022 at 2:51 PM Larkin Lowrey <llowrey@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, Mar 8, 2022 at 3:50 PM Song Liu <song@xxxxxxxxxx> wrote:
> > > On Mon, Mar 7, 2022 at 10:21 AM Larkin Lowrey <llowrey@xxxxxxxxxxxxxxxxx> wrote:
> > >> I am seeing a 'check' speed regression between kernels 5.15 and 5.16.
> > >> One host with a 20 drive array went from 170MB/s to 11MB/s. Another host
> > >> with a 15 drive array went from 180MB/s to 43MB/s. In both cases the
> > >> arrays are almost completely idle. I can flip between the two kernels
> > >> with no other changes and observe the performance changes.
> > >>
> > >> Is this a known issue?
> > >
> > > I am not aware of this issue. Could you please share
> > >
> > >    mdadm --detail /dev/mdXXXX
> > >
> > > output of the array?
> > >
> > > Thanks,
> > > Song
> >
> > Host A:
> > # mdadm --detail /dev/md1
> > /dev/md1:
> >             Version : 1.2
> >       Creation Time : Thu Nov 19 18:21:44 2020
> >          Raid Level : raid6
> >          Array Size : 126961942016 (118.24 TiB 130.01 TB)
> >       Used Dev Size : 9766303232 (9.10 TiB 10.00 TB)
> >        Raid Devices : 15
> >       Total Devices : 15
> >         Persistence : Superblock is persistent
> >
> >       Intent Bitmap : Internal
> >
> >         Update Time : Tue Mar  8 12:39:14 2022
> >               State : clean
> >      Active Devices : 15
> >     Working Devices : 15
> >      Failed Devices : 0
> >       Spare Devices : 0
> >
> >              Layout : left-symmetric
> >          Chunk Size : 512K
> >
> > Consistency Policy : bitmap
> >
> >                Name : fubar:1  (local to host fubar)
> >                UUID : eaefc9b7:74af4850:69556e2e:bc05d666
> >              Events : 85950
> >
> >      Number   Major   Minor   RaidDevice State
> >         0       8        1        0      active sync   /dev/sda1
> >         1       8       17        1      active sync   /dev/sdb1
> >         2       8       33        2      active sync   /dev/sdc1
> >         3       8       49        3      active sync   /dev/sdd1
> >         4       8       65        4      active sync   /dev/sde1
> >         5       8       81        5      active sync   /dev/sdf1
> >        16       8       97        6      active sync   /dev/sdg1
> >         7       8      113        7      active sync   /dev/sdh1
> >         8       8      129        8      active sync   /dev/sdi1
> >         9       8      145        9      active sync   /dev/sdj1
> >        10       8      161       10      active sync   /dev/sdk1
> >        11       8      177       11      active sync   /dev/sdl1
> >        12       8      193       12      active sync   /dev/sdm1
> >        13       8      209       13      active sync   /dev/sdn1
> >        14       8      225       14      active sync   /dev/sdo1
> >
> > Host B:
> > # mdadm --detail /dev/md1
> > /dev/md1:
> >             Version : 1.2
> >       Creation Time : Thu Oct 10 14:18:16 2019
> >          Raid Level : raid6
> >          Array Size : 140650080768 (130.99 TiB 144.03 TB)
> >       Used Dev Size : 7813893376 (7.28 TiB 8.00 TB)
> >        Raid Devices : 20
> >       Total Devices : 20
> >         Persistence : Superblock is persistent
> >
> >       Intent Bitmap : Internal
> >
> >         Update Time : Tue Mar  8 17:40:48 2022
> >               State : clean
> >      Active Devices : 20
> >     Working Devices : 20
> >      Failed Devices : 0
> >       Spare Devices : 0
> >
> >              Layout : left-symmetric
> >          Chunk Size : 128K
> >
> > Consistency Policy : bitmap
> >
> >                Name : mcp:1
> >                UUID : 803f5eb5:e59d4091:5b91fa17:64801e54
> >              Events : 302158
> >
> >      Number   Major   Minor   RaidDevice State
> >         0       8        1        0      active sync   /dev/sda1
> >         1      65      145        1      active sync   /dev/sdz1
> >         2      65      177        2      active sync   /dev/sdab1
> >         3      65      209        3      active sync   /dev/sdad1
> >         4       8      209        4      active sync   /dev/sdn1
> >         5      65      129        5      active sync   /dev/sdy1
> >         6       8      241        6      active sync   /dev/sdp1
> >         7      65      241        7      active sync   /dev/sdaf1
> >         8       8      161        8      active sync   /dev/sdk1
> >         9       8      113        9      active sync   /dev/sdh1
> >        10       8      129       10      active sync   /dev/sdi1
> >        11      66       33       11      active sync   /dev/sdai1
> >        12      65        1       12      active sync   /dev/sdq1
> >        13       8       65       13      active sync   /dev/sde1
> >        14      66       17       14      active sync   /dev/sdah1
> >        15       8       49       15      active sync   /dev/sdd1
> >        19      66       81       16      active sync   /dev/sdal1
> >        16      66       65       17      active sync   /dev/sdak1
> >        17       8      145       18      active sync   /dev/sdj1
> >        18      66      129       19      active sync   /dev/sdao1
> >
> > The regression was introduced somewhere between these two Fedora kernels:
> > 5.15.18-200 (good)
> > 5.16.5-200 (bad)
>
> Hi folks,
>
> Sorry for the regression and thanks for sharing your array setup and
> observations.
>
> I think I have found the fix for it. I will send a patch for it. If
> you want to try the fix sooner, you can find it at:
>
> For 5.16:
> https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=tmp/fix-5.16&id=872c1a638b9751061b11b64a240892c989d1c618
>
> For 5.17:
> https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=tmp/fix-5.17&id=c06ccb305e697d89fe99376c9036d1a2ece44c77
>
> Thanks,
> Song


