Re: Huge lock contention during raid5 build time

> We need major work to make it faster so that we can keep up with
> the speed of modern SSDs.

Glad to know that this is on your roadmap.
This is very important for storage server solutions, where you can fit
tens of Gen 4/5 NVMe SSDs in a 2U server.
I'm not a developer, but I can help with testing as much as required.

> Could you please do a perf-record with '-g' so that we can see
> which call paths hit the lock contention? This will help us
> understand whether Shushu's bitmap optimization can help.

Default raid5 build performance:

[root@memverge2 ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 nvme0n1[4] nvme2n1[2] nvme3n1[1] nvme4n1[0]
      4688044032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [>....................]  recovery =  0.3% (5601408/1562681344) finish=125.0min speed=207459K/sec
      bitmap: 0/12 pages [0KB], 65536KB chunk

After setting:

[root@memverge2 md]# echo 8 > group_thread_cnt
[root@memverge2 md]# echo 3600000 > sync_speed_max
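For repeated test runs, the two writes above could be scripted. This is only a minimal sketch: the helper name `set_md_tunables` is hypothetical, and the directory `/sys/block/md0/md` is an assumption based on the `md` directory shown in the prompt.

```python
from pathlib import Path

# Assumption: the shell prompt above is sitting in /sys/block/md0/md.
MD_SYSFS_DIR = Path("/sys/block/md0/md")


def set_md_tunables(md_dir, group_thread_cnt, sync_speed_max):
    """Write the two tunables used above and return what is read back.

    Hypothetical helper, not part of mdadm or the kernel tooling.
    """
    settings = {
        "group_thread_cnt": group_thread_cnt,
        "sync_speed_max": sync_speed_max,
    }
    read_back = {}
    for name, value in settings.items():
        attr = Path(md_dir) / name
        # Same effect as `echo <value> > <name>` in the md directory.
        attr.write_text(f"{value}\n")
        read_back[name] = attr.read_text().strip()
    return read_back


# Usage (needs root and an assembled md0):
#   set_md_tunables(MD_SYSFS_DIR, 8, 3600000)
```

On a real array the value read back from sysfs may carry extra text, so treat the return value as informational only.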

[root@memverge2 ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 nvme0n1[4] nvme2n1[2] nvme3n1[1] nvme4n1[0]
      4688044032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  7.9% (124671408/1562681344) finish=16.6min speed=1435737K/sec
      bitmap: 0/12 pages [0KB], 65536KB chunk
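
To put a number on the difference between the two runs, the `speed=` field can be pulled out of each /proc/mdstat snapshot and compared. A small sketch, with the function name `recovery_speed_k` being hypothetical and the two strings copied from the outputs above:

```python
import re

# Matches the "speed=<number>K/sec" field that mdstat prints during recovery.
SPEED_RE = re.compile(r"speed=(\d+)K/sec")


def recovery_speed_k(mdstat_text):
    """Return the recovery speed in K/sec, or None if no recovery is running."""
    match = SPEED_RE.search(mdstat_text)
    return int(match.group(1)) if match else None


# Values copied from the two /proc/mdstat outputs above.
before = "recovery =  0.3% (5601408/1562681344) finish=125.0min speed=207459K/sec"
after = "recovery =  7.9% (124671408/1562681344) finish=16.6min speed=1435737K/sec"

ratio = recovery_speed_k(after) / recovery_speed_k(before)
print(f"{ratio:.1f}x")  # prints "6.9x"
```

The regex searches the whole text, so it also works when the mail client wraps the `finish=`/`speed=` part onto its own line.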

perf.data.gz attached.

Anton

On Thu, Jan 23, 2025 at 20:01, Song Liu <song@xxxxxxxxxx> wrote:
>
> Hi Anton,
>
> Thanks for the report.
>
> On Thu, Jan 23, 2025 at 5:56 AM Anton Gavriliuk <antosha20xx@xxxxxxxxx> wrote:
> >
> > Hi
> >
> > I'm building mdadm raid5 (3+1), based on Intel's NVMe SSD P4600.
> >
> > Mdadm next version
> >
> > [root@memverge2 ~]# /home/anton/mdadm/mdadm --version
> > mdadm - v4.4-13-ge0df6c4c - 2025-01-17
> >
> > Maximum performance I saw ~1.4 GB/s.
> >
> > [root@memverge2 md]# cat /proc/mdstat
> > Personalities : [raid6] [raid5] [raid4]
> > md0 : active raid5 nvme0n1[4] nvme2n1[2] nvme3n1[1] nvme4n1[0]
> >       4688044032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
> >       [==============>......]  recovery = 71.8% (1122726044/1562681344) finish=5.1min speed=1428101K/sec
> >       bitmap: 0/12 pages [0KB], 65536KB chunk
>
> Given the rebuild speed of 1.4GB/s, which is pretty fast, I do
> not think this is a regression. Lock contention in the raid5 stack,
> including but not limited to the bitmap, is a known issue. We
> need major work to make it faster so that we can keep up with
> the speed of modern SSDs.
>
> >
> > Perf top shows huge spinlock contention
> >
> > Samples: 180K of event 'cycles:P', 4000 Hz, Event count (approx.): 175146370188 lost: 0/0 drop: 0/0
> > Overhead  Shared Object                             Symbol
> >   38.23%  [kernel]                                  [k] native_queued_spin_lock_slowpath
> >    8.33%  [kernel]                                  [k] analyse_stripe
> >    6.85%  [kernel]                                  [k] ops_run_io
> >    3.95%  [kernel]                                  [k] intel_idle_irq
> >    3.41%  [kernel]                                  [k] xor_avx_4
> >    2.76%  [kernel]                                  [k] handle_stripe
> >    2.37%  [kernel]                                  [k] raid5_end_read_request
> >    1.97%  [kernel]                                  [k] find_get_stripe
>
> Could you please do a perf-record with '-g' so that we can see
> which call paths hit the lock contention? This will help us
> understand whether Shushu's bitmap optimization can help.
>
> Thanks,
> Song

Attachment: perf.data.gz
Description: GNU Zip compressed data

