These are client-side metrics from a client warned as "failing to respond to cache pressure".

root@datagen-27:/sys/kernel/debug/ceph/e42fd4b0-313b-11ee-9a00-31da71873773.client1282187# cat bdi/stats
BdiWriteback:             0 kB
BdiReclaimable:           0 kB
BdiDirtyThresh:           0 kB
DirtyThresh:       35979376 kB
BackgroundThresh:  17967720 kB
BdiDirtied:         3071616 kB
BdiWritten:         3036864 kB
BdiWriteBandwidth:       20 kBps
b_dirty:                  0
b_io:                     0
b_more_io:                0
b_dirty_time:             0
bdi_list:                 1
state:                    1
------------------------------------------------
root@d27:/sys/kernel/debug/ceph/e42fd4b0-313b-11ee-9a00-31da71873773.client1282187# cat metrics
item                          total
------------------------------------------
opened files  / total inodes  4 / 14129
pinned i_caps / total inodes  14129 / 14129
opened inodes / total inodes  2 / 14129

item      total    avg_lat(us)  min_lat(us)  max_lat(us)  stdev(us)
-----------------------------------------------------------------------------------
read      1218753  3116         208          8741271      2154
write     34945    24003        3017         2191493      16156
metadata  1703642  8395         127          17936115     1497

item      total    avg_sz(bytes)  min_sz(bytes)  max_sz(bytes)  total_sz(bytes)
----------------------------------------------------------------------------------------
read      1218753  227009         1              4194304        276668475618
write     34945    85860          1              4194304        3000382055

item     total  miss    hit
-------------------------------------------------
d_lease  306    19110   3317071969
caps     14129  145404  3761682333

On Thu, 25 Jan 2024 at 20:25, Özkan Göksu <ozkangksu@xxxxxxxxx> wrote:

> Every user has a 1x subvolume and I only have 1 pool.
> At the beginning we were using each subvolume for the LDAP home directory
> plus user data. When a user logged in to any Docker container on any host,
> it used the cluster for home, and for the user-related data we had a
> second directory in the same subvolume.
> From time to time users experienced a very slow home environment, and
> after a month home became almost impossible to use. VNC sessions became
> unresponsive and slow, etc.
>
> 2 weeks ago I had to migrate home to ZFS storage, and now the overall
> performance is better with only user_data and without home.
> But the performance is still not as good as I expected, because of the
> problems related to the MDS.
> The usage is low but allocation is high, and CPU usage is high. You saw
> the IO op/s: it's nothing, yet allocation is high.
>
> I developed a fio benchmark script and ran it on 4x test servers at the
> same time; the results are below.
> Script:
> https://github.com/ozkangoksu/benchmark/blob/8f5df87997864c25ef32447e02fcd41fda0d2a67/iobench.sh
>
> https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-01.txt
> https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-02.txt
> https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-03.txt
> https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-04.txt
>
> While running the benchmark, I took sample values for each type of
> iobench run:
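[Editor's sketch: the linked iobench.sh is the authoritative benchmark. One leg of such a run could be approximated with a plain fio call using the parameters quoted in the samples below (size=1G, direct=1, numjobs=3, iodepth=32). The mount point, target directory, block size, and ioengine here are assumptions, not taken from the script:

  # sequential-write leg; the target directory must already exist
  fio --name=seqwrite --rw=write --bs=1M --size=1G --direct=1 \
      --numjobs=3 --iodepth=32 --ioengine=libaio --group_reporting \
      --directory=/mnt/cephfs/fiotest

Substituting --rw=read, randwrite, or randread, and dropping --bs to 4k/8k/16k, would approximate the other legs discussed below.]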
>
> Seq Write benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> client: 70 MiB/s rd, 762 MiB/s wr, 337 op/s rd, 24.41k op/s wr
> client: 60 MiB/s rd, 551 MiB/s wr, 303 op/s rd, 35.12k op/s wr
> client: 13 MiB/s rd, 161 MiB/s wr, 101 op/s rd, 41.30k op/s wr
>
> Seq Read benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> client: 1.6 GiB/s rd, 219 KiB/s wr, 28.76k op/s rd, 89 op/s wr
> client: 370 MiB/s rd, 475 KiB/s wr, 90.38k op/s rd, 89 op/s wr
>
> Rand Write benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> client: 63 MiB/s rd, 1.5 GiB/s wr, 8.77k op/s rd, 5.50k op/s wr
> client: 14 MiB/s rd, 1.8 GiB/s wr, 81 op/s rd, 13.86k op/s wr
> client: 6.6 MiB/s rd, 1.2 GiB/s wr, 61 op/s rd, 30.13k op/s wr
>
> Rand Read benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> client: 317 MiB/s rd, 841 MiB/s wr, 426 op/s rd, 10.98k op/s wr
> client: 2.8 GiB/s rd, 882 MiB/s wr, 25.68k op/s rd, 291 op/s wr
> client: 4.0 GiB/s rd, 226 MiB/s wr, 89.63k op/s rd, 124 op/s wr
> client: 2.4 GiB/s rd, 295 KiB/s wr, 197.86k op/s rd, 20 op/s wr
>
> It seems I only have problems with the 4K, 8K, and 16K block sizes, not
> with the other sizes.
>
>
> On Thu, 25 Jan 2024 at 19:06, Eugen Block <eblock@xxxxxx> wrote:
>
>> I understand that your MDS shows a high CPU usage, but other than that,
>> what is your performance issue? Do users complain? Do some operations
>> take longer than expected? Are OSDs saturated during those phases?
>> Because the cache pressure messages don't necessarily mean that users
>> will notice.
>> MDS daemons are single-threaded, so that might be a bottleneck. In that
>> case multi-active MDS might help, which you already tried and
>> experienced OOM killers. But you might have to disable the MDS
>> balancer as someone else mentioned. And then you could think about
>> pinning: is it possible to split the CephFS into multiple
>> subdirectories and pin them to different ranks?
>> But first I'd still like to know what the performance issue really is.
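[Editor's sketch: subtree pinning, as suggested above, is set through a virtual extended attribute on a directory. A minimal sketch, assuming a CephFS mounted at /mnt/cephfs; the subdirectory names are placeholders:

  # Pin team-a to MDS rank 0 and team-b to rank 1
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/team-a
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/team-b
  # A value of -1 removes the pin again

Pinning only has an effect with more than one active MDS (max_mds > 1), which in this thread previously ended in OOM kills, so the MDS cache limits would need to be sized accordingly first.]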
>>
>> Quoting Özkan Göksu <ozkangksu@xxxxxxxxx>:
>>
>> > I will try my best to explain my situation.
>> >
>> > I don't have a separate MDS server. I have 5 identical nodes; 3 of
>> > them are mons, and I use the other 2 as active and standby MDS
>> > (currently I have leftovers from max_mds 4).
>> >
>> > root@ud-01:~# ceph -s
>> >   cluster:
>> >     id:     e42fd4b0-313b-11ee-9a00-31da71873773
>> >     health: HEALTH_WARN
>> >             1 clients failing to respond to cache pressure
>> >
>> >   services:
>> >     mon: 3 daemons, quorum ud-01,ud-02,ud-03 (age 9d)
>> >     mgr: ud-01.qycnol(active, since 8d), standbys: ud-02.tfhqfd
>> >     mds: 1/1 daemons up, 4 standby
>> >     osd: 80 osds: 80 up (since 9d), 80 in (since 5M)
>> >
>> >   data:
>> >     volumes: 1/1 healthy
>> >     pools:   3 pools, 2305 pgs
>> >     objects: 106.58M objects, 25 TiB
>> >     usage:   45 TiB used, 101 TiB / 146 TiB avail
>> >     pgs:     2303 active+clean
>> >              2    active+clean+scrubbing+deep
>> >
>> >   io:
>> >     client: 16 MiB/s rd, 3.4 MiB/s wr, 77 op/s rd, 23 op/s wr
>> >
>> > ------------------------------
>> > root@ud-01:~# ceph fs status
>> > ud-data - 84 clients
>> > =======
>> > RANK  STATE           MDS            ACTIVITY     DNS    INOS   DIRS   CAPS
>> >  0    active  ud-data.ud-02.xcoojt  Reqs: 40 /s  2579k  2578k   169k  3048k
>> >         POOL           TYPE     USED  AVAIL
>> > cephfs.ud-data.meta  metadata   136G  44.9T
>> > cephfs.ud-data.data    data    44.3T  44.9T
>> >
>> > ------------------------------
>> > root@ud-01:~# ceph health detail
>> > HEALTH_WARN 1 clients failing to respond to cache pressure
>> > [WRN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
>> >     mds.ud-data.ud-02.xcoojt(mds.0): Client bmw-m4 failing to respond
>> >     to cache pressure client_id: 1275577
>> >
>> > ------------------------------
>> > When I check the failing client with session ls, I see only
>> > "num_caps: 12298":
>> >
>> > ceph tell mds.ud-data.ud-02.xcoojt session ls | jq -r '.[] | "clientid:
>> > \(.id)= num_caps: \(.num_caps), num_leases: \(.num_leases),
>> > request_load_avg: \(.request_load_avg), num_completed_requests:
>> > \(.num_completed_requests), num_completed_flushes:
>> > \(.num_completed_flushes)"' | sort -n -t: -k3
>> >
>> > clientid: 1275577= num_caps: 12298, num_leases: 0, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
>> > clientid: 1294542= num_caps: 13000, num_leases: 12, request_load_avg: 105, num_completed_requests: 0, num_completed_flushes: 6
>> > clientid: 1282187= num_caps: 16869, num_leases: 1, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
>> > clientid: 1275589= num_caps: 18943, num_leases: 0, request_load_avg: 52, num_completed_requests: 0, num_completed_flushes: 1
>> > clientid: 1282154= num_caps: 24747, num_leases: 1, request_load_avg: 57, num_completed_requests: 2, num_completed_flushes: 2
>> > clientid: 1275553= num_caps: 25120, num_leases: 2, request_load_avg: 116, num_completed_requests: 2, num_completed_flushes: 8
>> > clientid: 1282142= num_caps: 27185, num_leases: 6, request_load_avg: 128, num_completed_requests: 0, num_completed_flushes: 8
>> > clientid: 1275535= num_caps: 40364, num_leases: 6, request_load_avg: 111, num_completed_requests: 2, num_completed_flushes: 8
>> > clientid: 1282130= num_caps: 41483, num_leases: 0, request_load_avg: 135, num_completed_requests: 0, num_completed_flushes: 1
>> > clientid: 1275547= num_caps: 42953, num_leases: 4, request_load_avg: 119, num_completed_requests: 2, num_completed_flushes: 6
>> > clientid: 1282139= num_caps: 45435, num_leases: 27, request_load_avg: 84, num_completed_requests: 2, num_completed_flushes: 34
>> > clientid: 1282136= num_caps: 48374, num_leases: 8, request_load_avg: 0, num_completed_requests: 1, num_completed_flushes: 1
>> > clientid: 1275532= num_caps: 48664, num_leases: 7, request_load_avg: 115, num_completed_requests: 2, num_completed_flushes: 8
>> > clientid: 1191789= num_caps: 130319, num_leases: 0, request_load_avg: 1753, num_completed_requests: 0, num_completed_flushes: 0
>> > clientid: 1275571= num_caps: 139488, num_leases: 0, request_load_avg: 2, num_completed_requests: 0, num_completed_flushes: 1
>> > clientid: 1282133= num_caps: 145487, num_leases: 0, request_load_avg: 8, num_completed_requests: 1, num_completed_flushes: 1
>> > clientid: 1534496= num_caps: 1041316, num_leases: 0, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
>> >
>> > ------------------------------
>> > When I check the dashboard under service/mds I see 120%+ CPU usage on
>> > the active MDS, but on the host everything is almost idle and disk
>> > waits are very low:
>> >
>> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>> >            0.61    0.00    0.38    0.41    0.00   98.60
>> >
>> > Device  r/s   rMB/s  rrqm/s  %rrqm  r_await  rareq-sz  w/s     wMB/s  wrqm/s  %wrqm  w_await  wareq-sz  d/s   dMB/s  drqm/s  %drqm  d_await  dareq-sz  f/s     f_await  aqu-sz  %util
>> > sdc     2.00  0.01   0.00    0.00   0.50     6.00      20.00   0.04   0.00    0.00   0.50     2.00      0.00  0.00   0.00    0.00   0.00     0.00      10.00   0.60     0.02    1.20
>> > sdd     3.00  0.02   0.00    0.00   0.67     8.00      285.00  1.84   77.00   21.27  0.44     6.61      0.00  0.00   0.00    0.00   0.00     0.00      114.00  0.83     0.22    22.40
>> > sde     1.00  0.01   0.00    0.00   1.00     8.00      36.00   0.08   3.00    7.69   0.64     2.33      0.00  0.00   0.00    0.00   0.00     0.00      18.00   0.67     0.04    1.60
>> > sdf     5.00  0.04   0.00    0.00   0.40     7.20      40.00   0.09   3.00    6.98   0.53     2.30      0.00  0.00   0.00    0.00   0.00     0.00      20.00   0.70     0.04    2.00
>> > sdg     11.00 0.08   0.00    0.00   0.73     7.27      36.00   0.09   4.00    10.00  0.50     2.44      0.00  0.00   0.00    0.00   0.00     0.00      18.00   0.72     0.04    3.20
>> > sdh     5.00  0.03   0.00    0.00   0.60     5.60      46.00   0.10   2.00    4.17   0.59     2.17      0.00  0.00   0.00    0.00   0.00     0.00      23.00   0.83     0.05    2.80
>> > sdi     7.00  0.04   0.00    0.00   0.43     6.29      36.00   0.07   1.00    2.70   0.47     2.11      0.00  0.00   0.00    0.00   0.00     0.00      18.00   0.61     0.03    2.40
>> > sdj     5.00  0.04   0.00    0.00   0.80     7.20      42.00   0.09   1.00    2.33   0.67     2.10      0.00  0.00   0.00    0.00   0.00     0.00      21.00   0.81     0.05    3.20
>> >
>> > ------------------------------
>> > Other than this 5x node cluster, I also have a 3x node cluster with
>> > identical hardware, but it serves a different purpose and data
>> > workload. On that cluster I don't have any problems, and the MDS
>> > default settings seem to be enough.
>> > The only difference between the two clusters is that the 5x node
>> > cluster is used directly by users, while the 3x node cluster is used
>> > heavily to read and write data via projects, not by users, so
>> > allocation and de-allocation behave better.
>> >
>> > I guess I just have a problematic use case on the 5x node cluster,
>> > and as I mentioned above, I might have a similar problem but I don't
>> > know how to debug it.
>> >
>> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/
>> > quote: "A user running VSCodium, keeping 15k caps open.. the
>> > opportunistic caps recall eventually starts recalling those but the
>> > (el7 kernel) client won't release them. Stopping Codium seems to be
>> > the only way to release."
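[Editor's sketch: when a client holds caps it will not release, the MDS recall settings are the usual knobs to experiment with. The option names below exist in recent Ceph releases, but defaults differ between versions; the values are placeholders to test, not recommendations:

  # Allow the MDS to recall more caps per client
  ceph config set mds mds_recall_max_caps 30000
  ceph config set mds mds_recall_max_decay_rate 1.5
  # One-off: ask the active MDS to trim its cache and recall client
  # state, with a 60-second timeout
  ceph tell mds.ud-data.ud-02.xcoojt cache drop 60

As the quoted thread notes, a client that genuinely keeps files open (the VSCodium case) will still refuse to drop those caps.]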
>> >
>> > ------------------------------
>> > Before reading the osd df output, you should know that I created 2x
>> > OSDs per "CT4000MX500SSD1" drive.
>> > # ceph osd df tree
>> > ID   CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE   DATA      OMAP      META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
>> >  -1         145.54321         -  146 TiB   45 TiB    44 TiB   119 GiB  333 GiB  101 TiB  30.81  1.00    -          root default
>> >  -3          29.10864         -   29 TiB  8.9 TiB   8.8 TiB    25 GiB   66 GiB   20 TiB  30.54  0.99    -              host ud-01
>> >   0    ssd    1.81929   1.00000  1.8 TiB  616 GiB   610 GiB   1.4 GiB  4.5 GiB  1.2 TiB  33.04  1.07   61      up          osd.0
>> >   1    ssd    1.81929   1.00000  1.8 TiB  527 GiB   521 GiB   1.5 GiB  4.0 GiB  1.3 TiB  28.28  0.92   53      up          osd.1
>> >   2    ssd    1.81929   1.00000  1.8 TiB  595 GiB   589 GiB   2.3 GiB  4.0 GiB  1.2 TiB  31.96  1.04   63      up          osd.2
>> >   3    ssd    1.81929   1.00000  1.8 TiB  527 GiB   521 GiB   1.8 GiB  4.2 GiB  1.3 TiB  28.30  0.92   55      up          osd.3
>> >   4    ssd    1.81929   1.00000  1.8 TiB  525 GiB   520 GiB   1.3 GiB  3.9 GiB  1.3 TiB  28.21  0.92   52      up          osd.4
>> >   5    ssd    1.81929   1.00000  1.8 TiB  592 GiB   586 GiB   1.8 GiB  3.8 GiB  1.2 TiB  31.76  1.03   61      up          osd.5
>> >   6    ssd    1.81929   1.00000  1.8 TiB  559 GiB   553 GiB   1.8 GiB  4.3 GiB  1.3 TiB  30.03  0.97   57      up          osd.6
>> >   7    ssd    1.81929   1.00000  1.8 TiB  602 GiB   597 GiB   836 MiB  4.4 GiB  1.2 TiB  32.32  1.05   58      up          osd.7
>> >   8    ssd    1.81929   1.00000  1.8 TiB  614 GiB   609 GiB   1.2 GiB  4.5 GiB  1.2 TiB  32.98  1.07   60      up          osd.8
>> >   9    ssd    1.81929   1.00000  1.8 TiB  571 GiB   565 GiB   2.2 GiB  4.2 GiB  1.3 TiB  30.67  1.00   61      up          osd.9
>> >  10    ssd    1.81929   1.00000  1.8 TiB  528 GiB   522 GiB   1.3 GiB  4.1 GiB  1.3 TiB  28.33  0.92   52      up          osd.10
>> >  11    ssd    1.81929   1.00000  1.8 TiB  551 GiB   546 GiB   1.5 GiB  3.6 GiB  1.3 TiB  29.57  0.96   56      up          osd.11
>> >  12    ssd    1.81929   1.00000  1.8 TiB  594 GiB   588 GiB   1.8 GiB  4.4 GiB  1.2 TiB  31.91  1.04   61      up          osd.12
>> >  13    ssd    1.81929   1.00000  1.8 TiB  561 GiB   555 GiB   1.1 GiB  4.3 GiB  1.3 TiB  30.10  0.98   55      up          osd.13
>> >  14    ssd    1.81929   1.00000  1.8 TiB  616 GiB   609 GiB   1.9 GiB  4.2 GiB  1.2 TiB  33.04  1.07   64      up          osd.14
>> >  15    ssd    1.81929   1.00000  1.8 TiB  525 GiB   520 GiB   1.1 GiB  4.0 GiB  1.3 TiB  28.20  0.92   51      up          osd.15
>> >  -5          29.10864         -   29 TiB  9.0 TiB   8.9 TiB    22 GiB   67 GiB   20 TiB  30.89  1.00    -              host ud-02
>> >  16    ssd    1.81929   1.00000  1.8 TiB  617 GiB   611 GiB   1.7 GiB  4.7 GiB  1.2 TiB  33.12  1.08   63      up          osd.16
>> >  17    ssd    1.81929   1.00000  1.8 TiB  582 GiB   577 GiB   1.6 GiB  4.0 GiB  1.3 TiB  31.26  1.01   59      up          osd.17
>> >  18    ssd    1.81929   1.00000  1.8 TiB  583 GiB   578 GiB   418 MiB  4.0 GiB  1.3 TiB  31.29  1.02   54      up          osd.18
>> >  19    ssd    1.81929   1.00000  1.8 TiB  550 GiB   544 GiB   1.5 GiB  4.0 GiB  1.3 TiB  29.50  0.96   56      up          osd.19
>> >  20    ssd    1.81929   1.00000  1.8 TiB  551 GiB   546 GiB   1.1 GiB  4.1 GiB  1.3 TiB  29.57  0.96   54      up          osd.20
>> >  21    ssd    1.81929   1.00000  1.8 TiB  616 GiB   610 GiB   1.3 GiB  4.4 GiB  1.2 TiB  33.04  1.07   60      up          osd.21
>> >  22    ssd    1.81929   1.00000  1.8 TiB  573 GiB   567 GiB   1.6 GiB  4.1 GiB  1.3 TiB  30.75  1.00   58      up          osd.22
>> >  23    ssd    1.81929   1.00000  1.8 TiB  616 GiB   610 GiB   1.3 GiB  4.3 GiB  1.2 TiB  33.06  1.07   60      up          osd.23
>> >  24    ssd    1.81929   1.00000  1.8 TiB  539 GiB   534 GiB   844 MiB  3.8 GiB  1.3 TiB  28.92  0.94   51      up          osd.24
>> >  25    ssd    1.81929   1.00000  1.8 TiB  583 GiB   576 GiB   2.1 GiB  4.1 GiB  1.3 TiB  31.27  1.02   61      up          osd.25
>> >  26    ssd    1.81929   1.00000  1.8 TiB  617 GiB   611 GiB   1.3 GiB  4.6 GiB  1.2 TiB  33.12  1.08   61      up          osd.26
>> >  27    ssd    1.81929   1.00000  1.8 TiB  537 GiB   532 GiB   1.2 GiB  4.1 GiB  1.3 TiB  28.84  0.94   53      up          osd.27
>> >  28    ssd    1.81929   1.00000  1.8 TiB  527 GiB   522 GiB   1.3 GiB  4.2 GiB  1.3 TiB  28.29  0.92   53      up          osd.28
>> >  29    ssd    1.81929   1.00000  1.8 TiB  594 GiB   588 GiB   1.5 GiB  4.6 GiB  1.2 TiB  31.91  1.04   59      up          osd.29
>> >  30    ssd    1.81929   1.00000  1.8 TiB  528 GiB   523 GiB   1.4 GiB  4.1 GiB  1.3 TiB  28.35  0.92   53      up          osd.30
>> >  31    ssd    1.81929   1.00000  1.8 TiB  594 GiB   589 GiB   1.6 GiB  3.8 GiB  1.2 TiB  31.89  1.03   61      up          osd.31
>> >  -7          29.10864         -   29 TiB  8.9 TiB   8.8 TiB    23 GiB   67 GiB   20 TiB  30.66  1.00    -              host ud-03
>> >  32    ssd    1.81929   1.00000  1.8 TiB  593 GiB   588 GiB   1.1 GiB  4.3 GiB  1.2 TiB  31.84  1.03   57      up          osd.32
>> >  33    ssd    1.81929   1.00000  1.8 TiB  617 GiB   611 GiB   1.8 GiB  4.4 GiB  1.2 TiB  33.13  1.08   63      up          osd.33
>> >  34    ssd    1.81929   1.00000  1.8 TiB  537 GiB   532 GiB   2.0 GiB  3.8 GiB  1.3 TiB  28.84  0.94   59      up          osd.34
>> >  35    ssd    1.81929   1.00000  1.8 TiB  562 GiB   556 GiB   1.7 GiB  4.2 GiB  1.3 TiB  30.16  0.98   58      up          osd.35
>> >  36    ssd    1.81929   1.00000  1.8 TiB  529 GiB   523 GiB   1.3 GiB  3.9 GiB  1.3 TiB  28.38  0.92   52      up          osd.36
>> >  37    ssd    1.81929   1.00000  1.8 TiB  527 GiB   521 GiB   1.7 GiB  4.2 GiB  1.3 TiB  28.28  0.92   55      up          osd.37
>> >  38    ssd    1.81929   1.00000  1.8 TiB  574 GiB   568 GiB   1.2 GiB  4.3 GiB  1.3 TiB  30.79  1.00   55      up          osd.38
>> >  39    ssd    1.81929   1.00000  1.8 TiB  605 GiB   599 GiB   1.6 GiB  4.2 GiB  1.2 TiB  32.48  1.05   61      up          osd.39
>> >  40    ssd    1.81929   1.00000  1.8 TiB  573 GiB   567 GiB   1.2 GiB  4.4 GiB  1.3 TiB  30.76  1.00   56      up          osd.40
>> >  41    ssd    1.81929   1.00000  1.8 TiB  526 GiB   520 GiB   1.7 GiB  3.9 GiB  1.3 TiB  28.21  0.92   54      up          osd.41
>> >  42    ssd    1.81929   1.00000  1.8 TiB  613 GiB   608 GiB  1010 MiB  4.4 GiB  1.2 TiB  32.91  1.07   58      up          osd.42
>> >  43    ssd    1.81929   1.00000  1.8 TiB  606 GiB   600 GiB   1.7 GiB  4.3 GiB  1.2 TiB  32.51  1.06   61      up          osd.43
>> >  44    ssd    1.81929   1.00000  1.8 TiB  583 GiB   577 GiB   1.6 GiB  4.2 GiB  1.3 TiB  31.29  1.02   60      up          osd.44
>> >  45    ssd    1.81929   1.00000  1.8 TiB  618 GiB   613 GiB   1.4 GiB  4.3 GiB  1.2 TiB  33.18  1.08   62      up          osd.45
>> >  46    ssd    1.81929   1.00000  1.8 TiB  550 GiB   544 GiB   1.5 GiB  4.2 GiB  1.3 TiB  29.50  0.96   54      up          osd.46
>> >  47    ssd    1.81929   1.00000  1.8 TiB  526 GiB   522 GiB   692 MiB  3.7 GiB  1.3 TiB  28.25  0.92   50      up          osd.47
>> >  -9          29.10864         -   29 TiB  9.0 TiB   8.9 TiB    26 GiB   68 GiB   20 TiB  31.04  1.01    -              host ud-04
>> >  48    ssd    1.81929   1.00000  1.8 TiB  540 GiB   534 GiB   2.2 GiB  3.6 GiB  1.3 TiB  28.96  0.94   58      up          osd.48
>> >  49    ssd    1.81929   1.00000  1.8 TiB  617 GiB   611 GiB   1.4 GiB  4.5 GiB  1.2 TiB  33.11  1.07   61      up          osd.49
>> >  50    ssd    1.81929   1.00000  1.8 TiB  618 GiB   612 GiB   1.2 GiB  4.8 GiB  1.2 TiB  33.17  1.08   61      up          osd.50
>> >  51    ssd    1.81929   1.00000  1.8 TiB  618 GiB   612 GiB   1.5 GiB  4.5 GiB  1.2 TiB  33.19  1.08   61      up          osd.51
>> >  52    ssd    1.81929   1.00000  1.8 TiB  526 GiB   521 GiB   1.4 GiB  4.1 GiB  1.3 TiB  28.25  0.92   53      up          osd.52
>> >  53    ssd    1.81929   1.00000  1.8 TiB  618 GiB   611 GiB   2.4 GiB  4.3 GiB  1.2 TiB  33.17  1.08   66      up          osd.53
>> >  54    ssd    1.81929   1.00000  1.8 TiB  550 GiB   544 GiB   1.5 GiB  4.3 GiB  1.3 TiB  29.54  0.96   55      up          osd.54
>> >  55    ssd    1.81929   1.00000  1.8 TiB  527 GiB   522 GiB   1.3 GiB  4.0 GiB  1.3 TiB  28.29  0.92   52      up          osd.55
>> >  56    ssd    1.81929   1.00000  1.8 TiB  525 GiB   519 GiB   1.2 GiB  4.1 GiB  1.3 TiB  28.16  0.91   52      up          osd.56
>> >  57    ssd    1.81929   1.00000  1.8 TiB  615 GiB   609 GiB   2.3 GiB  4.2 GiB  1.2 TiB  33.03  1.07   65      up          osd.57
>> >  58    ssd    1.81929   1.00000  1.8 TiB  527 GiB   522 GiB   1.6 GiB  3.7 GiB  1.3 TiB  28.31  0.92   55      up          osd.58
>> >  59    ssd    1.81929   1.00000  1.8 TiB  615 GiB   609 GiB   1.2 GiB  4.6 GiB  1.2 TiB  33.01  1.07   60      up          osd.59
>> >  60    ssd    1.81929   1.00000  1.8 TiB  594 GiB   588 GiB   1.2 GiB  4.4 GiB  1.2 TiB  31.88  1.03   59      up          osd.60
>> >  61    ssd    1.81929   1.00000  1.8 TiB  616 GiB   610 GiB   1.9 GiB  4.1 GiB  1.2 TiB  33.04  1.07   64      up          osd.61
>> >  62    ssd    1.81929   1.00000  1.8 TiB  620 GiB   614 GiB   1.9 GiB  4.4 GiB  1.2 TiB  33.27  1.08   63      up          osd.62
>> >  63    ssd    1.81929   1.00000  1.8 TiB  527 GiB   522 GiB   1.5 GiB  4.0 GiB  1.3 TiB  28.30  0.92   53      up          osd.63
>> > -11          29.10864         -   29 TiB  9.0 TiB   8.9 TiB    23 GiB   65 GiB   20 TiB  30.91  1.00    -              host ud-05
>> >  64    ssd    1.81929   1.00000  1.8 TiB  608 GiB   601 GiB   2.3 GiB  4.5 GiB  1.2 TiB  32.62  1.06   65      up          osd.64
>> >  65    ssd    1.81929   1.00000  1.8 TiB  606 GiB   601 GiB   628 MiB  4.2 GiB  1.2 TiB  32.53  1.06   57      up          osd.65
>> >  66    ssd    1.81929   1.00000  1.8 TiB  583 GiB   578 GiB   1.3 GiB  4.3 GiB  1.2 TiB  31.31  1.02   57      up          osd.66
>> >  67    ssd    1.81929   1.00000  1.8 TiB  537 GiB   533 GiB   436 MiB  3.6 GiB  1.3 TiB  28.82  0.94   50      up          osd.67
>> >  68    ssd    1.81929   1.00000  1.8 TiB  541 GiB   535 GiB   2.5 GiB  3.8 GiB  1.3 TiB  29.04  0.94   59      up          osd.68
>> >  69    ssd    1.81929   1.00000  1.8 TiB  606 GiB   601 GiB   1.1 GiB  4.4 GiB  1.2 TiB  32.55  1.06   59      up          osd.69
>> >  70    ssd    1.81929   1.00000  1.8 TiB  604 GiB   598 GiB   1.8 GiB  4.1 GiB  1.2 TiB  32.44  1.05   63      up          osd.70
>> >  71    ssd    1.81929   1.00000  1.8 TiB  606 GiB   600 GiB   1.9 GiB  4.5 GiB  1.2 TiB  32.53  1.06   62      up          osd.71
>> >  72    ssd    1.81929   1.00000  1.8 TiB  602 GiB   598 GiB   612 MiB  4.1 GiB  1.2 TiB  32.33  1.05   57      up          osd.72
>> >  73    ssd    1.81929   1.00000  1.8 TiB  571 GiB   565 GiB   1.8 GiB  4.5 GiB  1.3 TiB  30.65  0.99   58      up          osd.73
>> >  74    ssd    1.81929   1.00000  1.8 TiB  608 GiB   602 GiB   1.8 GiB  4.2 GiB  1.2 TiB  32.62  1.06   61      up          osd.74
>> >  75    ssd    1.81929   1.00000  1.8 TiB  536 GiB   531 GiB   1.9 GiB  3.5 GiB  1.3 TiB  28.80  0.93   57      up          osd.75
>> >  76    ssd    1.81929   1.00000  1.8 TiB  605 GiB   599 GiB   1.4 GiB  4.5 GiB  1.2 TiB  32.48  1.05   60      up          osd.76
>> >  77    ssd    1.81929   1.00000  1.8 TiB  537 GiB   532 GiB   1.2 GiB  3.9 GiB  1.3 TiB  28.84  0.94   52      up          osd.77
>> >  78    ssd    1.81929   1.00000  1.8 TiB  525 GiB   520 GiB   1.3 GiB  3.8 GiB  1.3 TiB  28.20  0.92   52      up          osd.78
>> >  79    ssd    1.81929   1.00000  1.8 TiB  536 GiB   531 GiB   1.1 GiB  3.3 GiB  1.3 TiB  28.76  0.93   53      up          osd.79
>> >                          TOTAL   146 TiB   45 TiB    44 TiB   119 GiB  333 GiB  101 TiB  30.81
>> > MIN/MAX VAR: 0.91/1.08  STDDEV: 1.90
>> >
>> >
>> > On Thu, 25 Jan 2024 at 16:52, Eugen Block <eblock@xxxxxx> wrote:
>> >
>> >> There is no definitive answer wrt MDS tuning. As is mentioned
>> >> everywhere, it's about finding the right setup for your specific
>> >> workload. If you can synthesize your workload (maybe scaled down a
>> >> bit), try optimizing it in a test cluster without interrupting your
>> >> developers too much.
>> >> But what you haven't explained yet is what you are experiencing as a
>> >> performance issue. Do you have numbers or a detailed description?
>> >> From the fs status output you didn't seem to have too much activity
>> >> going on (around 140 requests per second), but that's probably not
>> >> the usual traffic? What does ceph report in its client IO output?
>> >> Can you paste the 'ceph osd df' output as well?
>> >> Do you have dedicated MDS servers, or are they colocated with other
>> >> services?
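[Editor's sketch: on the "client IO output" question above, besides the io: line in ceph status, per-client CephFS metrics can be sampled from the MGR stats module, assuming a release that ships it (Pacific or later):

  ceph mgr module enable stats
  ceph fs perf stats

The JSON output includes per-client read/write IOPS and latency, which can help attribute load to the client IDs listed earlier in the thread.]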
>> >>
>> >> Quoting Özkan Göksu <ozkangksu@xxxxxxxxx>:
>> >>
>> >> > Hello Eugen.
>> >> >
>> >> > I have read all of your MDS-related topics; thank you so much for
>> >> > your effort on this.
>> >> > There is not much information out there, and I couldn't find an MDS
>> >> > tuning guide at all. It seems that you are the right person to
>> >> > discuss MDS debugging and tuning with.
>> >> >
>> >> > Do you have any documents, or may I ask what the proper way is to
>> >> > debug the MDS and the clients?
>> >> > Which debug logs will guide me to understand the limitations and
>> >> > help me tune according to the data flow?
>> >> >
>> >> > While searching, I found this:
>> >> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/
>> >> > quote: "A user running VSCodium, keeping 15k caps open.. the
>> >> > opportunistic caps recall eventually starts recalling those but the
>> >> > (el7 kernel) client won't release them. Stopping Codium seems to be
>> >> > the only way to release."
>> >> >
>> >> > Because of this, I think I also need to play around with the client
>> >> > side.
>> >> >
>> >> > My main goal is increasing the speed and reducing the latency, and
>> >> > I wonder whether these ideas are correct:
>> >> > - Maybe I need to increase the client-side cache size, because
>> >> >   through each client multiple users request a lot of objects, and
>> >> >   clearly the client_cache_size=16 default is not enough.
>> >> > - Maybe I need to increase the client-side maximum cache limits for
>> >> >   objects ("client_oc_max_objects=1000 to 10000") and for data
>> >> >   ("client_oc_size=200mi to 400mi").
>> >> > - The client cache cleaning threshold is not aggressive enough to
>> >> >   keep the free cache size in the desired range. I need to make it
>> >> >   aggressive, but this should not reduce speed or increase latency.
>> >> >
>> >> > mds_cache_memory_limit=4gi to 16gi
>> >> > client_oc_max_objects=1000 to 10000
>> >> > client_oc_size=200mi to 400mi
>> >> > client_permissions=false  # to reduce latency
>> >> > client_cache_size=16 to 128
>> >> >
>> >> > What do you think?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx