Re: Massive performance issues

Hi folks,

I would also run iostat -dmx 1 on host 'lenticular' during the fio benchmark, just to make sure osd.10 is not being badly hammered with I/Os. Given the very high number of PGs this OSD is involved in, it could be capping the cluster's HDD performance.

> 10    hdd   3.63869   1.00000  3.6 TiB  1.9 TiB  1.9 TiB  127 MiB   5.9 GiB  1.7 TiB  52.08  1.19  500      up          osd.10     
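
For example, a minimal way to check this on lenticular while the fio run is active (the metadata field names can vary slightly between releases, but one of them points at the device node backing osd.10):

# ceph osd metadata 10 | grep -E '"devices"|"bluestore_bdev_dev_node"'
# iostat -dmx 1

If that device sits near 100 %util with climbing await while the other HDDs on the host stay comparatively idle, osd.10 is the bottleneck.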

Regards,
Frédéric.

----- On 14 Mar 25, at 8:40, Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx> wrote:

> Hi Thomas & Anthony,
> 
> Anthony provided great recommendations.
> 
> SSD read performance:
> I find the total number of PGs per SSD OSD too low; it could be twice as high.
> 
> HDD read performance:
> What makes me a little suspicious is that the maximum throughput of about
> 120 MB/s is exactly the maximum of a 1 Gbit/s connection.
> (I have seen this in the past when the routing was not correct; also, if you
> use VMs for testing, the network can be the limiting factor.)
> Can you run your performance test with a higher numjobs, e.g. 16, and post
> the results?
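> For example, reusing the original fio command line quoted below and only
> raising numjobs (the job name here is arbitrary; --group_reporting makes fio
> print one aggregate result for the 16 jobs; point --filename at whichever RBD
> device you benchmarked before):
> 
> # fio --ioengine=libaio --direct=1 --bs=16384 --iodepth=128 --rw=randread --norandommap --size=20G --numjobs=16 --runtime=300 --time_based --group_reporting --filename=/dev/rbd/rbd/test --name=hddtest16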
> 
> Regards, Joachim
> 
>  joachim.kraftmayer@xxxxxxxxx
> 
>  www.clyso.com
> 
>  Hohenzollernstr. 27, 80801 Munich
> 
> Utting | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE275430677
> 
> 
> 
> On Fri, 14 Mar 2025 at 01:53, Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
> 
>>
>> > Hi,
>> >
>> > we are having massive performance issues with our Ceph cluster, and by
>> > now I have no idea how and where to debug further.
>> >
>> > See attachments
>>
>> I see inline text.  If you attached files, those don’t propagate through
>> the list.
>>
>> > for full benchmark logs, relevant excerpts:
>> >
>> > HDD pool:
>> >> read: IOPS=7591, BW=119MiB/s (124MB/s)(34.9GiB/300892msec)
>> >
>> > SSD pool:
>> >> read: IOPS=2007, BW=31.4MiB/s (32.9MB/s)(9422MiB/300334msec)
>> >
>> > Yes, the SSD pool is in fact slower in both IOPS and data rate than the
>> > HDD pool.
>>
>> Making some assumptions based on the below:
>>
>> * Twice as many HDD OSDs as SSD OSDs, so twice the ostensible potential
>> for parallelism
>> * The 5210 ION is QLC from 6 years ago.  Its performance is significantly
>> lower than your Samsungs and especially the 5300s
>> * Your pg_num values are probably fine.
>> * Looks like you are benching against a KRBD mount, so I can’t tell which
>> pool that’s against, or which media your pools are using.  Note that your
>> EC RBD pool is going to be inherently slow with high latency.  (A quick way
>> to pin a run to one specific pool is sketched after this list.)
>> * An RBD pool will be limited by its slowest OSDs.  Half of your SSDs are
>> 5210s.  So … oh man this is triggering my combinatorics PTSD … many of your
>> PGs in RBD pools using the SSDs will include at least one of them.  The
>> proportion depends on which systems they’re in, which we can’t readily tell
>> from the supplied data.
>> * Being respectful of your resources, your systems as described are pretty
>> light CPU-wise for their duties, especially if those colocated VMs are
>> using much at all.
>> * The OSDs on Pileus especially are light on available RAM.  And those
>> include some of the SSDs.
>> * The RAM usage of VMs on the other systems may similarly be starving
>> their OSDs.
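>>
>> One way to remove the ambiguity about which pool and media a run hits is to
>> bench a throwaway image created explicitly in one pool. A sketch for the SSD
>> pool (image name and size are arbitrary; reuse the same fio line against the
>> mapped device, then repeat against a replicated HDD pool such as rbd; for
>> rbd.ec the image would instead be created in a replicated pool with
>> --data-pool rbd.ec):
>>
>> # rbd create --size 20G rbd.ssd/benchtest
>> # rbd map rbd.ssd/benchtest
>> # fio --ioengine=libaio --direct=1 --bs=16384 --iodepth=128 --rw=randread --norandommap --size=20G --numjobs=1 --runtime=300 --time_based --filename=/dev/rbd/rbd.ssd/benchtest --name=ssdtest
>> # rbd unmap rbd.ssd/benchtest && rbd rm rbd.ssd/benchtest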
>>
>> If you could post `ceph osd metadata` somewhere and send the list a link
>> we can be more specific.
>>
>> Same with `ceph osd crush dump`
>> —aad
>>
>>
>>
>> > Of course this is only one particular benchmark scenario, but
>> > at least it has some numbers and not just “everything feels slow”. I’m
>> > happy to run different benchmarks if required.
>> >
>> > Cluster information:
>> > * 2x10G ethernet to a Cisco Catalyst 3850 on each node
>> > * Debian 11 (bullseye), Linux 5.10
>> > * Ceph 18.2.4 (reef), managed with cephadm
>> > * All but the oldest node serve as VM hypervisors (KVM) as well
>> > * 6 nodes (hostnames appended to compare with OSD information later)
>> >    - 2x AMD EPYC 7232P, 64G RAM (cirrus, nimbus)
>> >    - 1x Intel Xeon E3-1230 v3, 32G RAM (pileus, no VMs)
>> >    - 2x Intel Xeon E5-1620 v4, 64G RAM (lenticular, nacreous)
>> >    - 1x Intel Xeon Silver 4110, 96G RAM (stratus)
>> > * Cluster usage: mostly RBD, some RGW and CephFS
>> > * OSDs:
>> >    - 19 Hitachi Ultrastar 7K4000 2TB
>> >    - 1 WD Ultrastar DC HC310 4TB
>> >    - 3 Hitachi Deskstar 7k2000 2TB
>> >    - 13 Hitachi Ultrastar A7K3000 2TB
>> >    - 8 Micron 5210 ION 2TB
>> >    - 4 Micron 5300 MAX - Mixed Use 2TB
>> >    - 5 Samsung Datacenter PM893 2TB
>> >
>> > You can find various information that I deemed useful below. Please ask
>> > if you would like further information.
>> >
>> > As I said, I don’t really know where to look and what to do, so I would
>> > really appreciate any pointers on how to debug this and improve
>> > performance.
>> >
>> > Of course I have already seen the pg warnings, but I am really not sure
>> > what to adjust in which direction, especially given the contradiction
>> > between the MANY_OBJECTS_PER_PG and POOL_TOO_MANY_PGS warnings for the
>> > default.rgw.ec.data pool (there are too many objects per pg, so I
>> > should probably increase pg_num, yet it recommends scaling down
>> > from 128 to 32?!).
>> >
>> > Thanks,
>> > Thomas
>> >
>> > # ceph -s
>> >  cluster:
>> >    id:     91688ac0-1b4f-43e3-913d-a844338d9325
>> >    health: HEALTH_WARN
>> >            1 pools have many more objects per pg than average
>> >            3 pools have too many placement groups
>> >
>> >  services:
>> >    mon: 5 daemons, quorum nimbus,lenticular,stratus,cirrus,nacreous (age
>> 5d)
>> >    mgr: lenticular(active, since 3M), standbys: nimbus, stratus,
>> nacreous, cirrus
>> >    mds: 1/1 daemons up, 1 standby
>> >    osd: 53 osds: 53 up (since 2w), 53 in (since 18M)
>> >    rgw: 5 daemons active (5 hosts, 1 zones)
>> >
>> >  data:
>> >    volumes: 1/1 healthy
>> >    pools:   15 pools, 2977 pgs
>> >    objects: 7.20M objects, 15 TiB
>> >    usage:   42 TiB used, 55 TiB / 97 TiB avail
>> >    pgs:     2975 active+clean
>> >             1    active+clean+scrubbing
>> >             1    active+clean+scrubbing+deep
>> >
>> >  io:
>> >    client:   6.5 MiB/s rd, 1.3 MiB/s wr, 158 op/s rd, 63 op/s wr
>> >
>> > # ceph health detail
>> > HEALTH_WARN 1 pools have many more objects per pg than average; 3 pools
>> have too many placement groups
>> > [WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than
>> average
>> >    pool default.rgw.ec.data objects per pg (27197) is more than 11.2524
>> times cluster average (2417)
>> > [WRN] POOL_TOO_MANY_PGS: 3 pools have too many placement groups
>> >    Pool default.rgw.ec.data has 128 placement groups, should have 32
>> >    Pool templates has 512 placement groups, should have 256
>> >    Pool rbd.ec has 1024 placement groups, should have 256
>> >
>> > # ceph df
>> > --- RAW STORAGE ---
>> > CLASS    SIZE   AVAIL     USED  RAW USED  %RAW USED
>> > hdd    67 TiB  29 TiB   39 TiB    39 TiB      57.58
>> > ssd    30 TiB  26 TiB  3.7 TiB   3.7 TiB      12.37
>> > TOTAL  97 TiB  55 TiB   42 TiB    42 TiB      43.74
>> >
>> > --- POOLS ---
>> > POOL                        ID   PGS   STORED  OBJECTS     USED  %USED
>> MAX AVAIL
>> > templates                    1   512  4.4 TiB    1.27M   13 TiB  37.46
>>   7.4 TiB
>> > .rgw.root                    9    32  1.4 KiB        4  768 KiB      0
>>   7.4 TiB
>> > default.rgw.control         10    32      0 B        8      0 B      0
>>   7.4 TiB
>> > default.rgw.meta            11    32  7.0 KiB       38  5.1 MiB      0
>>   7.4 TiB
>> > default.rgw.log             12    32  3.5 KiB      208  5.7 MiB      0
>>   7.4 TiB
>> > default.rgw.ec.data         14   128  1.3 TiB    3.48M  2.7 TiB  10.95
>>    13 TiB
>> > default.rgw.buckets.index   15    32   57 MiB       53  171 MiB      0
>>   7.4 TiB
>> > default.rgw.buckets.data    16    32  2.5 GiB   17.85k  8.3 GiB   0.04
>>   7.4 TiB
>> > default.rgw.buckets.non-ec  17    32  1.6 KiB        1  197 KiB      0
>>   7.4 TiB
>> > rbd.ssd                     19   512  872 GiB  259.98k  2.6 TiB   9.46
>>   8.1 TiB
>> > .mgr                        20     1  640 MiB      161  1.9 GiB      0
>>   7.4 TiB
>> > cephfs_metadata             21    32  404 MiB   35.40k  1.2 GiB      0
>>   8.1 TiB
>> > cephfs_data                 22    32   42 GiB   28.09k  128 GiB   0.56
>>   7.4 TiB
>> > rbd                         23   512   12 GiB    6.57k   39 GiB   0.17
>>   7.4 TiB
>> > rbd.ec                      24  1024    8 TiB    2.10M   16 TiB  42.48
>>    11 TiB
>> >
>> > The pool "templates" is named as such for historical reasons. It is
>> > used as RBD VM image storage.
>> >
>> > # ceph osd pool autoscale-status
>> > POOL                          SIZE  TARGET SIZE                RATE  RAW
>> CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM
>> AUTOSCALE  BULK
>> > .rgw.root                   256.0k                              3.0
>>   68931G  0.0000                                  1.0      32
>> off        False
>> > default.rgw.control             0                               3.0
>>   68931G  0.0000                                  1.0      32
>> off        False
>> > default.rgw.meta             1740k                              3.0
>>   68931G  0.0000                                  1.0      32
>> off        False
>> > default.rgw.log              1932k                              3.0
>>   68931G  0.0000                                  1.0      32
>> off        False
>> > default.rgw.ec.data          1678G               1.6666666269302368
>>   68931G  0.0406                                  1.0     128
>> warn       False
>> > default.rgw.buckets.index   58292k                              3.0
>>   68931G  0.0000                                  1.0      32
>> off        False
>> > default.rgw.buckets.data     2828M                              3.0
>>   68931G  0.0001                                  1.0      32
>> off        False
>> > default.rgw.buckets.non-ec  67159                               3.0
>>   68931G  0.0000                                  1.0      32
>> off        False
>> > .mgr                        639.7M                              3.0
>>   68931G  0.0000                                  1.0       1
>> off        False
>> > cephfs_metadata             405.4M                              3.0
>>   30404G  0.0000                                  4.0      32
>> off        False
>> > cephfs_data                 43733M                              3.0
>>   68931G  0.0019                                  1.0      32
>> off        False
>> > templates                    4541G                              3.0
>>   68931G  0.1977                                  1.0     512
>> warn       True
>> > rbd.ssd                     870.4G                              3.0
>>   30404G  0.0859                                  1.0     512
>> warn       True
>> > rbd                         13336M                              3.0
>>   68931G  0.0006                                  1.0     512
>> off        True
>> > rbd.ec                       8401G                              2.0
>>     68931G  0.2438                                  1.0    1024
>>   warn       True
>> >
>> > # ceph osd pool ls detail
>> > pool 1 'templates' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 7495052
>> lfor 0/2976944/7469772 flags hashpspool,selfmanaged_snaps,bulk stripe_width
>> 0 application rbd read_balance_score 1.48
>> > pool 9 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 7467477 flags
>> hashpspool stripe_width 0 application rgw read_balance_score 3.38
>> > pool 10 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
>> 7467478 flags hashpspool stripe_width 0 application rgw read_balance_score
>> 3.37
>> > pool 11 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
>> 7467479 flags hashpspool stripe_width 0 application rgw read_balance_score
>> 3.36
>> > pool 12 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
>> 7467480 flags hashpspool stripe_width 0 application rgw read_balance_score
>> 4.52
>> > pool 14 'default.rgw.ec.data' erasure profile rgw-video size 5 min_size
>> 4 crush_rule 3 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode
>> warn last_change 7494999 lfor 0/5467738/7467533 flags hashpspool
>> stripe_width 12288 application rgw
>> > pool 15 'default.rgw.buckets.index' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off
>> last_change 7467482 flags hashpspool stripe_width 0 application rgw
>> read_balance_score 4.52
>> > pool 16 'default.rgw.buckets.data' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off
>> last_change 7467483 flags hashpspool stripe_width 0 application rgw
>> read_balance_score 3.39
>> > pool 17 'default.rgw.buckets.non-ec' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off
>> last_change 7467484 flags hashpspool stripe_width 0 application rgw
>> read_balance_score 3.38
>> > pool 19 'rbd.ssd' replicated size 3 min_size 2 crush_rule 4 object_hash
>> rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 7495053
>> lfor 0/141997/7467030 flags hashpspool,selfmanaged_snaps,bulk stripe_width
>> 0 application rbd read_balance_score 1.43
>> > pool 20 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 7480977 flags
>> hashpspool stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth
>> read_balance_score 37.50
>> > pool 21 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 4
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
>> 7467487 lfor 0/0/96365 flags hashpspool stripe_width 0 pg_autoscale_bias 4
>> pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.59
>> > pool 22 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change
>> 7467488 lfor 0/0/96365 flags hashpspool,selfmanaged_snaps stripe_width 0
>> application cephfs read_balance_score 4.52
>> > pool 23 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 512 pgp_num 512 autoscale_mode off last_change 7467489 lfor
>> 0/0/884114 flags hashpspool,selfmanaged_snaps,bulk stripe_width 0
>> application rbd read_balance_score 1.41
>> > pool 24 'rbd.ec' erasure profile rbd size 4 min_size 3 crush_rule 5
>> object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn
>> last_change 7495054 lfor 0/2366959/2366961 flags
>> hashpspool,ec_overwrites,selfmanaged_snaps,bulk stripe_width 8192
>> application rbd
>> >
>> > # ceph osd utilization
>> > avg 192.66
>> > stddev 75.9851 (expected baseline 13.7486)
>> > min osd.48 with 88 pgs (0.456762 * mean)
>> > max osd.10 with 500 pgs (2.59524 * mean)
>> >
>> > # ceph osd df tree
>> > ID   CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP
>>  META      AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
>> > -13         95.71759         -   97 TiB   42 TiB   42 TiB  4.2 GiB   204
>> GiB   55 TiB  43.74  1.00    -          root default
>> > -16         14.15759         -   15 TiB  8.2 TiB  8.2 TiB  713 MiB    29
>> GiB  6.4 TiB  56.36  1.29    -              host cirrus
>> >  2    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.0 TiB   85 MiB   4.9
>> GiB  784 GiB  57.94  1.32  223      up          osd.2
>> >  5    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   50 MiB   3.2
>> GiB  821 GiB  55.95  1.28  223      up          osd.5
>> >  8    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   88 MiB   2.9
>> GiB  804 GiB  56.86  1.30  229      up          osd.8
>> > 11    hdd   1.76970   1.00000  1.8 TiB  963 GiB  959 GiB  236 MiB   3.6
>> GiB  900 GiB  51.66  1.18  240      up          osd.11
>> > 13    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   61 MiB   4.7
>> GiB  823 GiB  55.81  1.28  220      up          osd.13
>> > 15    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   52 MiB   3.1
>> GiB  762 GiB  59.13  1.35  229      up          osd.15
>> > 18    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.0 TiB   85 MiB   2.7
>> GiB  787 GiB  57.76  1.32  228      up          osd.18
>> > 19    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   54 MiB   3.4
>> GiB  824 GiB  55.79  1.28  219      up          osd.19
>> > -18         15.93417         -   16 TiB  6.0 TiB  6.0 TiB  663 MiB    23
>> GiB   10 TiB  37.15  0.85    -              host lenticular
>> >  0    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   88 MiB   3.2
>> GiB  796 GiB  57.26  1.31  229      up          osd.0
>> >  4    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   42 MiB   6.0
>> GiB  753 GiB  59.57  1.36  231      up          osd.4
>> >  6    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   54 MiB   3.5
>> GiB  758 GiB  59.31  1.36  237      up          osd.6
>> > 10    hdd   3.63869   1.00000  3.6 TiB  1.9 TiB  1.9 TiB  127 MiB   5.9
>> GiB  1.7 TiB  52.08  1.19  500      up          osd.10
>> > 36    ssd   1.74660   1.00000  1.7 TiB  229 GiB  228 GiB   92 MiB   959
>> MiB  1.5 TiB  12.79  0.29  103      up          osd.36
>> > 37    ssd   1.74660   1.00000  1.7 TiB  223 GiB  222 GiB   97 MiB  1006
>> MiB  1.5 TiB  12.49  0.29   98      up          osd.37
>> > 38    ssd   1.74660   1.00000  1.7 TiB  223 GiB  222 GiB   61 MiB   1.0
>> GiB  1.5 TiB  12.46  0.28   98      up          osd.38
>> > 39    ssd   1.74660   1.00000  1.7 TiB  221 GiB  219 GiB  102 MiB   1.4
>> GiB  1.5 TiB  12.35  0.28   98      up          osd.39
>> > -17         22.89058         -   23 TiB  9.5 TiB  9.4 TiB  1.1 GiB    45
>> GiB   14 TiB  40.59  0.93    -              host nacreous
>> >  1    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB  124 MiB   6.0
>> GiB  818 GiB  56.07  1.28  229      up          osd.1
>> >  3    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   54 MiB   3.1
>> GiB  780 GiB  58.11  1.33  222      up          osd.3
>> >  7    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   48 MiB   3.1
>> GiB  773 GiB  58.48  1.34  237      up          osd.7
>> >  9    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   50 MiB   4.9
>> GiB  766 GiB  58.88  1.35  223      up          osd.9
>> > 14    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   93 MiB   3.3
>> GiB  812 GiB  56.43  1.29  224      up          osd.14
>> > 16    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   43 MiB   5.0
>> GiB  792 GiB  57.51  1.31  235      up          osd.16
>> > 17    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   68 MiB   4.2
>> GiB  777 GiB  58.31  1.33  227      up          osd.17
>> > 20    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   66 MiB   5.9
>> GiB  780 GiB  58.15  1.33  231      up          osd.20
>> > 48    ssd   1.74660   1.00000  1.7 TiB  213 GiB  212 GiB   94 MiB   1.3
>> GiB  1.5 TiB  11.92  0.27   88      up          osd.48
>> > 49    ssd   1.74660   1.00000  1.7 TiB  215 GiB  212 GiB   94 MiB   2.5
>> GiB  1.5 TiB  12.01  0.27   90      up          osd.49
>> > 50    ssd   1.74660   1.00000  1.7 TiB  214 GiB  212 GiB   95 MiB   2.1
>> GiB  1.5 TiB  11.99  0.27   92      up          osd.50
>> > 51    ssd   1.74660   1.00000  1.7 TiB  217 GiB  215 GiB  144 MiB   1.9
>> GiB  1.5 TiB  12.13  0.28   94      up          osd.51
>> > 52    ssd   1.74660   1.00000  1.7 TiB  214 GiB  212 GiB  144 MiB   2.0
>> GiB  1.5 TiB  11.95  0.27   94      up          osd.52
>> > -19          6.98639         -  7.0 TiB  895 GiB  889 GiB  345 MiB   5.6
>> GiB  6.1 TiB  12.51  0.29    -              host nimbus
>> > 44    ssd   1.74660   1.00000  1.7 TiB  227 GiB  224 GiB  148 MiB   2.3
>> GiB  1.5 TiB  12.67  0.29   98      up          osd.44
>> > 45    ssd   1.74660   1.00000  1.7 TiB  222 GiB  221 GiB   68 MiB  1021
>> MiB  1.5 TiB  12.39  0.28   93      up          osd.45
>> > 46    ssd   1.74660   1.00000  1.7 TiB  223 GiB  222 GiB   64 MiB   972
>> MiB  1.5 TiB  12.48  0.29   97      up          osd.46
>> > 47    ssd   1.74660   1.00000  1.7 TiB  224 GiB  222 GiB   64 MiB   1.3
>> GiB  1.5 TiB  12.52  0.29   95      up          osd.47
>> > -15         21.19368         -   22 TiB  9.3 TiB  9.3 TiB  848 MiB    56
>> GiB   12 TiB  43.35  0.99    -              host pileus
>> > 12    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   54 MiB   8.7
>> GiB  790 GiB  57.61  1.32  231      up          osd.12
>> > 21    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   46 MiB   6.1
>> GiB  765 GiB  58.94  1.35  236      up          osd.21
>> > 22    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   60 MiB   6.2
>> GiB  810 GiB  56.52  1.29  228      up          osd.22
>> > 23    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   83 MiB   8.0
>> GiB  779 GiB  58.16  1.33  230      up          osd.23
>> > 24    hdd   1.76970   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   44 MiB   5.5
>> GiB  777 GiB  58.29  1.33  229      up          osd.24
>> > 25    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   73 MiB   5.6
>> GiB  799 GiB  57.09  1.31  230      up          osd.25
>> > 26    hdd   1.76970   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   63 MiB   5.4
>> GiB  797 GiB  57.22  1.31  239      up          osd.26
>> > 27    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   99 MiB   7.1
>> GiB  720 GiB  61.36  1.40  247      up          osd.27
>> > 40    ssd   1.74660   1.00000  1.7 TiB  220 GiB  219 GiB   66 MiB  1013
>> MiB  1.5 TiB  12.31  0.28   96      up          osd.40
>> > 41    ssd   1.74660   1.00000  1.7 TiB  221 GiB  220 GiB   63 MiB   981
>> MiB  1.5 TiB  12.36  0.28   97      up          osd.41
>> > 42    ssd   1.74660   1.00000  1.7 TiB  228 GiB  227 GiB   98 MiB   999
>> MiB  1.5 TiB  12.73  0.29  101      up          osd.42
>> > 43    ssd   1.74660   1.00000  1.7 TiB  227 GiB  226 GiB   99 MiB   866
>> MiB  1.5 TiB  12.67  0.29  100      up          osd.43
>> > -14         14.55518         -   15 TiB  8.6 TiB  8.5 TiB  582 MiB    45
>> GiB  6.0 TiB  59.02  1.35    -              host stratus
>> > 28    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   54 MiB   5.5
>> GiB  756 GiB  59.41  1.36  239      up          osd.28
>> > 29    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   44 MiB   4.7
>> GiB  759 GiB  59.27  1.35  230      up          osd.29
>> > 30    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   68 MiB   5.0
>> GiB  753 GiB  59.57  1.36  232      up          osd.30
>> > 31    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   64 MiB   6.2
>> GiB  778 GiB  58.26  1.33  229      up          osd.31
>> > 32    hdd   1.81940   1.00000  1.8 TiB  1.0 TiB  1.0 TiB   85 MiB   5.0
>> GiB  797 GiB  57.24  1.31  226      up          osd.32
>> > 33    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB  112 MiB   4.1
>> GiB  749 GiB  59.81  1.37  235      up          osd.33
>> > 34    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB   46 MiB   8.5
>> GiB  770 GiB  58.66  1.34  238      up          osd.34
>> > 35    hdd   1.81940   1.00000  1.8 TiB  1.1 TiB  1.1 TiB  109 MiB   6.2
>> GiB  747 GiB  59.92  1.37  244      up          osd.35
>> >                         TOTAL   97 TiB   42 TiB   42 TiB  4.2 GiB   204
>> GiB   55 TiB  43.74
>> > MIN/MAX VAR: 0.27/1.40  STDDEV: 21.24
>> >
>> > # ceph balancer status
>> > {
>> >    "active": true,
>> >    "last_optimize_duration": "0:00:00.040070",
>> >    "last_optimize_started": "Thu Mar 13 14:59:05 2025",
>> >    "mode": "upmap",
>> >    "no_optimization_needed": true,
>> >    "optimize_result": "Unable to find further optimization, or pool(s)
>> pg_num is decreasing, or distribution is already perfect",
>> >    "plans": []
>> > }
>> >
>> > --
>> > Fachschaft I/1 Mathematik/Physik/Informatik der RWTH Aachen
>> > Thomas Schneider
>> > Campus Mitte: Augustinerbach 2a, 52062 Aachen, Phone: +49 241 80 94506
>> > Informatikzentrum: Ahornstraße 55, Raum 4U17, 52074 Aachen, Phone: +49 241 80 26741
>> > https://www.fsmpi.rwth-aachen.de
>> > # fio --ioengine=libaio --direct=1 --bs=16384 --iodepth=128 --rw=randread --norandommap --size=20G --numjobs=1 --runtime=300 --time_based --name=/dev/rbd/rbd/test
>> > /dev/rbd/rbd/test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
>> 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=128
>> > fio-3.25
>> > Starting 1 process
>> > Jobs: 1 (f=1): [r(1)][100.0%][r=35.0MiB/s][r=2241 IOPS][eta 00m:00s]
>> > /dev/rbd/rbd/test: (groupid=0, jobs=1): err= 0: pid=3832471: Thu Jan 30
>> 19:55:46 2025
>> >  read: IOPS=7591, BW=119MiB/s (124MB/s)(34.9GiB/300892msec)
>> >    slat (usec): min=2, max=665, avg= 8.93, stdev= 5.57
>> >    clat (usec): min=65, max=5330.5k, avg=16850.13, stdev=184056.88
>> >     lat (usec): min=82, max=5330.5k, avg=16859.23, stdev=184056.97
>> >    clat percentiles (usec):
>> >     |  1.00th=[    147],  5.00th=[    180], 10.00th=[    202],
>> >     | 20.00th=[    237], 30.00th=[    269], 40.00th=[    297],
>> >     | 50.00th=[    330], 60.00th=[    367], 70.00th=[    420],
>> >     | 80.00th=[    523], 90.00th=[   1876], 95.00th=[   6128],
>> >     | 99.00th=[ 362808], 99.50th=[1098908], 99.90th=[3506439],
>> >     | 99.95th=[4143973], 99.99th=[4731175]
>> >   bw (  KiB/s): min=  768, max=478432, per=100.00%, avg=121819.01,
>> stdev=95962.09, samples=600
>> >   iops        : min=   48, max=29902, avg=7613.70, stdev=5997.62,
>> samples=600
>> >  lat (usec)   : 100=0.01%, 250=24.52%, 500=53.85%, 750=8.61%, 1000=1.56%
>> >  lat (msec)   : 2=1.84%, 4=3.46%, 10=2.34%, 20=1.12%, 50=0.91%
>> >  lat (msec)   : 100=0.40%, 250=0.31%, 500=0.13%, 750=0.18%, 1000=0.21%
>> >  lat (msec)   : 2000=0.34%, >=2000=0.20%
>> >  cpu          : usr=2.33%, sys=10.06%, ctx=1687235, majf=0, minf=523
>> >  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
>> >=64=100.0%
>> >     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>> >=64=0.0%
>> >     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>> >=64=0.1%
>> >     issued rwts: total=2284212,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>> >     latency   : target=0, window=0, percentile=100.00%, depth=128
>> >
>> > Run status group 0 (all jobs):
>> >   READ: bw=119MiB/s (124MB/s), 119MiB/s-119MiB/s (124MB/s-124MB/s),
>> io=34.9GiB (37.4GB), run=300892-300892msec
>> > <fio-ssd.txt>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx