Hello Frank. I have 84 clients (high-end servers) running Ubuntu 20.04.5
LTS with kernel Linux 5.4.0-125-generic. My cluster is on 17.2.6 quincy,
but I have some client nodes with "ceph-common/stable,now 17.2.7-1focal".
I wonder whether using newer-version clients is the main problem? Maybe I
have a communication error. For example, I hit this problem and I cannot
collect client stats:
https://github.com/ceph/ceph/pull/52127/files

Best regards.

On Fri, 26 Jan 2024 at 14:53, Frank Schilder <frans@xxxxxx> wrote:

> Hi, this message is one of those that are often spurious. I don't recall
> in which thread/PR/tracker I read it, but the story was something like
> this:
>
> If an MDS gets under memory pressure, it will request dentry items back
> from *all* clients, not just the active ones or the ones holding many of
> them. If you have a client that's below the min-threshold for dentries
> (it's one of the client/MDS tuning options), it will not respond. This
> client will be flagged as not responding, which is a false positive.
>
> I believe the devs are working on a fix to get rid of these spurious
> warnings. There is a "bug/feature" in the MDS that does not clear this
> warning flag for inactive clients. Hence, the message hangs around and
> never disappears. I usually clear it with an "echo 3 >
> /proc/sys/vm/drop_caches" on the client. However, apart from being
> annoying in the dashboard, it has no performance or other negative
> impact.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Eugen Block <eblock@xxxxxx>
> Sent: Friday, January 26, 2024 10:05 AM
> To: Özkan Göksu
> Cc: ceph-users@xxxxxxx
> Subject: Re: 1 clients failing to respond to cache pressure
> (quincy:17.2.6)
>
> Performance for small files is more about IOPS than throughput, and the
> IOPS in your fio tests look okay to me. What you could try is to split
> the PGs to get around 150 or 200 PGs per OSD. You're currently at around
> 60 according to the ceph osd df output. Before you do that, can you
> share 'ceph pg ls-by-pool cephfs.ud-data.data | head'? I don't need the
> whole output, just to see how many objects each PG has. We had a case
> once where that helped, but it was an older cluster and the pool was
> backed by HDDs with separate RocksDB on SSDs. So this might not be the
> solution here, but it could improve things as well.
>
>
> Zitat von Özkan Göksu <ozkangksu@xxxxxxxxx>:
>
> > Every user has a 1x subvolume and I only have 1 pool.
> > At the beginning we were using each subvolume for the LDAP home
> > directory + user data.
> > When a user logged in to any Docker container on any host, it used the
> > cluster for home, and for the user-related data we had a second
> > directory in the same subvolume.
> > From time to time users experienced a very slow home environment, and
> > after a month it became almost impossible to use home. VNC sessions
> > became unresponsive, slow, etc.
> >
> > 2 weeks ago I had to migrate home to ZFS storage, and now the overall
> > performance is better with only user_data and without home.
> > But the performance is still not as good as I expected, because of the
> > MDS-related problems.
> > Usage is low but allocation is high, and CPU usage is high. You saw
> > the IO op/s: it's nothing, yet allocation is high.
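A minimal sketch of the workaround Frank describes above, put together
from commands used elsewhere in this thread (the MDS name
ud-data.ud-02.xcoojt is taken from the outputs quoted below; SSH root
access to the flagged client is an assumption):

  # Show which client is flagged for cache pressure (MDS_CLIENT_RECALL).
  ceph health detail | grep 'cache pressure'
  # Rank clients by cap count in the MDS session list (jq assumed installed).
  ceph tell mds.ud-data.ud-02.xcoojt session ls | \
      jq -r '.[] | "\(.id) \(.num_caps)"' | sort -n -k2
  # On the flagged client, drop dentry/inode caches to clear the warning.
  ssh root@<client> 'sync; echo 3 > /proc/sys/vm/drop_caches'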
> >
> > I developed a fio benchmark script and ran it on 4x test servers at
> > the same time; the results are below.
> > Script:
> > https://github.com/ozkangoksu/benchmark/blob/8f5df87997864c25ef32447e02fcd41fda0d2a67/iobench.sh
> >
> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-01.txt
> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-02.txt
> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-03.txt
> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-04.txt
> >
> > While running the benchmark, I took sample values for each type of
> > iobench run:
> >
> > Seq write benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> > client: 70 MiB/s rd, 762 MiB/s wr, 337 op/s rd, 24.41k op/s wr
> > client: 60 MiB/s rd, 551 MiB/s wr, 303 op/s rd, 35.12k op/s wr
> > client: 13 MiB/s rd, 161 MiB/s wr, 101 op/s rd, 41.30k op/s wr
> >
> > Seq read benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> > client: 1.6 GiB/s rd, 219 KiB/s wr, 28.76k op/s rd, 89 op/s wr
> > client: 370 MiB/s rd, 475 KiB/s wr, 90.38k op/s rd, 89 op/s wr
> >
> > Rand write benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> > client: 63 MiB/s rd, 1.5 GiB/s wr, 8.77k op/s rd, 5.50k op/s wr
> > client: 14 MiB/s rd, 1.8 GiB/s wr, 81 op/s rd, 13.86k op/s wr
> > client: 6.6 MiB/s rd, 1.2 GiB/s wr, 61 op/s rd, 30.13k op/s wr
> >
> > Rand read benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
> > client: 317 MiB/s rd, 841 MiB/s wr, 426 op/s rd, 10.98k op/s wr
> > client: 2.8 GiB/s rd, 882 MiB/s wr, 25.68k op/s rd, 291 op/s wr
> > client: 4.0 GiB/s rd, 226 MiB/s wr, 89.63k op/s rd, 124 op/s wr
> > client: 2.4 GiB/s rd, 295 KiB/s wr, 197.86k op/s rd, 20 op/s wr
> >
> > It seems I only have problems with the 4K, 8K, and 16K sector sizes;
> > the other sector sizes are fine.
> >
> >
> > On Thu, 25 Jan 2024 at 19:06, Eugen Block <eblock@xxxxxx> wrote:
> >
> >> I understand that your MDS shows high CPU usage, but other than that,
> >> what is your performance issue? Do users complain? Do some operations
> >> take longer than expected? Are OSDs saturated during those phases?
> >> Because the cache pressure messages don't necessarily mean that users
> >> will notice anything.
> >> MDS daemons are single-threaded, so that might be a bottleneck. In
> >> that case multi-active MDS might help, which you already tried and
> >> experienced OOM killers with. But you might have to disable the MDS
> >> balancer, as someone else mentioned. And then you could think about
> >> pinning: is it possible to split the CephFS into multiple
> >> subdirectories and pin them to different ranks?
> >> But first I'd still like to know what the performance issue really is.
> >>
> >> Zitat von Özkan Göksu <ozkangksu@xxxxxxxxx>:
> >>
> >> > I will try my best to explain my situation.
> >> >
> >> > I don't have a separate MDS server. I have 5 identical nodes; 3 of
> >> > them are mons, and I use the other 2 as active and standby MDS
> >> > (currently I have leftovers from max_mds 4).
> >> >
> >> > root@ud-01:~# ceph -s
> >> >   cluster:
> >> >     id:     e42fd4b0-313b-11ee-9a00-31da71873773
> >> >     health: HEALTH_WARN
> >> >             1 clients failing to respond to cache pressure
> >> >
> >> >   services:
> >> >     mon: 3 daemons, quorum ud-01,ud-02,ud-03 (age 9d)
> >> >     mgr: ud-01.qycnol(active, since 8d), standbys: ud-02.tfhqfd
> >> >     mds: 1/1 daemons up, 4 standby
> >> >     osd: 80 osds: 80 up (since 9d), 80 in (since 5M)
> >> >
> >> >   data:
> >> >     volumes: 1/1 healthy
> >> >     pools:   3 pools, 2305 pgs
> >> >     objects: 106.58M objects, 25 TiB
> >> >     usage:   45 TiB used, 101 TiB / 146 TiB avail
> >> >     pgs:     2303 active+clean
> >> >              2    active+clean+scrubbing+deep
> >> >
> >> >   io:
> >> >     client: 16 MiB/s rd, 3.4 MiB/s wr, 77 op/s rd, 23 op/s wr
> >> >
> >> > ------------------------------
> >> > root@ud-01:~# ceph fs status
> >> > ud-data - 84 clients
> >> > =======
> >> > RANK  STATE          MDS            ACTIVITY     DNS    INOS   DIRS  CAPS
> >> >  0    active  ud-data.ud-02.xcoojt  Reqs: 40 /s  2579k  2578k  169k  3048k
> >> >         POOL           TYPE     USED  AVAIL
> >> > cephfs.ud-data.meta  metadata   136G  44.9T
> >> > cephfs.ud-data.data    data    44.3T  44.9T
> >> >
> >> > ------------------------------
> >> > root@ud-01:~# ceph health detail
> >> > HEALTH_WARN 1 clients failing to respond to cache pressure
> >> > [WRN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
> >> >     mds.ud-data.ud-02.xcoojt(mds.0): Client bmw-m4 failing to respond
> >> >     to cache pressure client_id: 1275577
> >> >
> >> > ------------------------------
> >> > When I check the failing client with session ls, I see only
> >> > "num_caps: 12298":
> >> >
> >> > ceph tell mds.ud-data.ud-02.xcoojt session ls | jq -r '.[] | "clientid:
> >> > \(.id)= num_caps: \(.num_caps), num_leases: \(.num_leases),
> >> > request_load_avg: \(.request_load_avg), num_completed_requests:
> >> > \(.num_completed_requests), num_completed_flushes:
> >> > \(.num_completed_flushes)"' | sort -n -t: -k3
> >> >
> >> > clientid: 1275577= num_caps: 12298, num_leases: 0, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
> >> > clientid: 1294542= num_caps: 13000, num_leases: 12, request_load_avg: 105, num_completed_requests: 0, num_completed_flushes: 6
> >> > clientid: 1282187= num_caps: 16869, num_leases: 1, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
> >> > clientid: 1275589= num_caps: 18943, num_leases: 0, request_load_avg: 52, num_completed_requests: 0, num_completed_flushes: 1
> >> > clientid: 1282154= num_caps: 24747, num_leases: 1, request_load_avg: 57, num_completed_requests: 2, num_completed_flushes: 2
> >> > clientid: 1275553= num_caps: 25120, num_leases: 2, request_load_avg: 116, num_completed_requests: 2, num_completed_flushes: 8
> >> > clientid: 1282142= num_caps: 27185, num_leases: 6, request_load_avg: 128, num_completed_requests: 0, num_completed_flushes: 8
> >> > clientid: 1275535= num_caps: 40364, num_leases: 6, request_load_avg: 111, num_completed_requests: 2, num_completed_flushes: 8
> >> > clientid: 1282130= num_caps: 41483, num_leases: 0, request_load_avg: 135, num_completed_requests: 0, num_completed_flushes: 1
> >> > clientid: 1275547= num_caps: 42953, num_leases: 4, request_load_avg: 119, num_completed_requests: 2, num_completed_flushes: 6
> >> > clientid: 1282139= num_caps: 45435, num_leases: 27, request_load_avg: 84, num_completed_requests: 2, num_completed_flushes: 34
> >> > clientid: 1282136= num_caps: 48374, num_leases: 8, request_load_avg: 0, num_completed_requests: 1, num_completed_flushes: 1
> >> > clientid: 1275532= num_caps: 48664, num_leases: 7, request_load_avg: 115, num_completed_requests: 2, num_completed_flushes: 8
> >> > clientid: 1191789= num_caps: 130319, num_leases: 0, request_load_avg: 1753, num_completed_requests: 0, num_completed_flushes: 0
> >> > clientid: 1275571= num_caps: 139488, num_leases: 0, request_load_avg: 2, num_completed_requests: 0, num_completed_flushes: 1
> >> > clientid: 1282133= num_caps: 145487, num_leases: 0, request_load_avg: 8, num_completed_requests: 1, num_completed_flushes: 1
> >> > clientid: 1534496= num_caps: 1041316, num_leases: 0, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
> >> >
> >> > ------------------------------
> >> > When I check the dashboard/service/mds page I see 120%+ CPU usage on
> >> > the active MDS, but on the host everything is almost idle and disk
> >> > waits are very low.
> >> >
> >> > avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
> >> >            0.61   0.00     0.38     0.41    0.00  98.60
> >> >
> >> > Device  r/s  rMB/s  rrqm/s  %rrqm  r_await  rareq-sz  w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz  d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz  f/s  f_await  aqu-sz  %util
> >> > sdc   2.00  0.01  0.00  0.00  0.50  6.00   20.00  0.04   0.00   0.00  0.50  2.00  0.00  0.00  0.00  0.00  0.00  0.00   10.00  0.60  0.02   1.20
> >> > sdd   3.00  0.02  0.00  0.00  0.67  8.00  285.00  1.84  77.00  21.27  0.44  6.61  0.00  0.00  0.00  0.00  0.00  0.00  114.00  0.83  0.22  22.40
> >> > sde   1.00  0.01  0.00  0.00  1.00  8.00   36.00  0.08   3.00   7.69  0.64  2.33  0.00  0.00  0.00  0.00  0.00  0.00   18.00  0.67  0.04   1.60
> >> > sdf   5.00  0.04  0.00  0.00  0.40  7.20   40.00  0.09   3.00   6.98  0.53  2.30  0.00  0.00  0.00  0.00  0.00  0.00   20.00  0.70  0.04   2.00
> >> > sdg  11.00  0.08  0.00  0.00  0.73  7.27   36.00  0.09   4.00  10.00  0.50  2.44  0.00  0.00  0.00  0.00  0.00  0.00   18.00  0.72  0.04   3.20
> >> > sdh   5.00  0.03  0.00  0.00  0.60  5.60   46.00  0.10   2.00   4.17  0.59  2.17  0.00  0.00  0.00  0.00  0.00  0.00   23.00  0.83  0.05   2.80
> >> > sdi   7.00  0.04  0.00  0.00  0.43  6.29   36.00  0.07   1.00   2.70  0.47  2.11  0.00  0.00  0.00  0.00  0.00  0.00   18.00  0.61  0.03   2.40
> >> > sdj   5.00  0.04  0.00  0.00  0.80  7.20   42.00  0.09   1.00   2.33  0.67  2.10  0.00  0.00  0.00  0.00  0.00  0.00   21.00  0.81  0.05   3.20
> >> >
> >> > ------------------------------
> >> > Other than this 5x node cluster, I also have a 3x node cluster with
> >> > identical hardware, but it serves a different purpose and data
> >> > workload. On that cluster I don't have any problems, and the MDS
> >> > default settings seem to be enough.
> >> > The only difference between the two clusters is that the 5x node
> >> > cluster is used directly by users, while the 3x node cluster is used
> >> > heavily to read and write data via projects, not by users, so
> >> > allocation and de-allocation behave better.
> >> >
> >> > I guess I just have a problematic use case on the 5x node cluster,
> >> > and as I mentioned above, I might have a similar problem but I don't
> >> > know how to debug it.
> >> >
> >> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/
> >> > quote: "A user running VSCodium, keeping 15k caps open..
> >> > the opportunistic caps recall eventually starts recalling those but
> >> > the (el7 kernel) client won't release them. Stopping Codium seems to
> >> > be the only way to release."
> >> >
> >> > ------------------------------
> >> > Before reading the osd df output, you should know that I created 2x
> >> > OSDs per "CT4000MX500SSD1" drive.
> >> > # ceph osd df tree
> >> > ID   CLASS  WEIGHT     REWEIGHT  SIZE     RAW USE  DATA     OMAP      META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
> >> >  -1         145.54321         -  146 TiB   45 TiB   44 TiB   119 GiB  333 GiB  101 TiB  30.81  1.00    -          root default
> >> >  -3          29.10864         -   29 TiB  8.9 TiB  8.8 TiB    25 GiB   66 GiB   20 TiB  30.54  0.99    -          host ud-01
> >> >   0  ssd      1.81929   1.00000  1.8 TiB  616 GiB  610 GiB   1.4 GiB  4.5 GiB  1.2 TiB  33.04  1.07   61      up  osd.0
> >> >   1  ssd      1.81929   1.00000  1.8 TiB  527 GiB  521 GiB   1.5 GiB  4.0 GiB  1.3 TiB  28.28  0.92   53      up  osd.1
> >> >   2  ssd      1.81929   1.00000  1.8 TiB  595 GiB  589 GiB   2.3 GiB  4.0 GiB  1.2 TiB  31.96  1.04   63      up  osd.2
> >> >   3  ssd      1.81929   1.00000  1.8 TiB  527 GiB  521 GiB   1.8 GiB  4.2 GiB  1.3 TiB  28.30  0.92   55      up  osd.3
> >> >   4  ssd      1.81929   1.00000  1.8 TiB  525 GiB  520 GiB   1.3 GiB  3.9 GiB  1.3 TiB  28.21  0.92   52      up  osd.4
> >> >   5  ssd      1.81929   1.00000  1.8 TiB  592 GiB  586 GiB   1.8 GiB  3.8 GiB  1.2 TiB  31.76  1.03   61      up  osd.5
> >> >   6  ssd      1.81929   1.00000  1.8 TiB  559 GiB  553 GiB   1.8 GiB  4.3 GiB  1.3 TiB  30.03  0.97   57      up  osd.6
> >> >   7  ssd      1.81929   1.00000  1.8 TiB  602 GiB  597 GiB   836 MiB  4.4 GiB  1.2 TiB  32.32  1.05   58      up  osd.7
> >> >   8  ssd      1.81929   1.00000  1.8 TiB  614 GiB  609 GiB   1.2 GiB  4.5 GiB  1.2 TiB  32.98  1.07   60      up  osd.8
> >> >   9  ssd      1.81929   1.00000  1.8 TiB  571 GiB  565 GiB   2.2 GiB  4.2 GiB  1.3 TiB  30.67  1.00   61      up  osd.9
> >> >  10  ssd      1.81929   1.00000  1.8 TiB  528 GiB  522 GiB   1.3 GiB  4.1 GiB  1.3 TiB  28.33  0.92   52      up  osd.10
> >> >  11  ssd      1.81929   1.00000  1.8 TiB  551 GiB  546 GiB   1.5 GiB  3.6 GiB  1.3 TiB  29.57  0.96   56      up  osd.11
> >> >  12  ssd      1.81929   1.00000  1.8 TiB  594 GiB  588 GiB   1.8 GiB  4.4 GiB  1.2 TiB  31.91  1.04   61      up  osd.12
> >> >  13  ssd      1.81929   1.00000  1.8 TiB  561 GiB  555 GiB   1.1 GiB  4.3 GiB  1.3 TiB  30.10  0.98   55      up  osd.13
> >> >  14  ssd      1.81929   1.00000  1.8 TiB  616 GiB  609 GiB   1.9 GiB  4.2 GiB  1.2 TiB  33.04  1.07   64      up  osd.14
> >> >  15  ssd      1.81929   1.00000  1.8 TiB  525 GiB  520 GiB   1.1 GiB  4.0 GiB  1.3 TiB  28.20  0.92   51      up  osd.15
> >> >  -5          29.10864         -   29 TiB  9.0 TiB  8.9 TiB    22 GiB   67 GiB   20 TiB  30.89  1.00    -          host ud-02
> >> >  16  ssd      1.81929   1.00000  1.8 TiB  617 GiB  611 GiB   1.7 GiB  4.7 GiB  1.2 TiB  33.12  1.08   63      up  osd.16
> >> >  17  ssd      1.81929   1.00000  1.8 TiB  582 GiB  577 GiB   1.6 GiB  4.0 GiB  1.3 TiB  31.26  1.01   59      up  osd.17
> >> >  18  ssd      1.81929   1.00000  1.8 TiB  583 GiB  578 GiB   418 MiB  4.0 GiB  1.3 TiB  31.29  1.02   54      up  osd.18
> >> >  19  ssd      1.81929   1.00000  1.8 TiB  550 GiB  544 GiB   1.5 GiB  4.0 GiB  1.3 TiB  29.50  0.96   56      up  osd.19
> >> >  20  ssd      1.81929   1.00000  1.8 TiB  551 GiB  546 GiB   1.1 GiB  4.1 GiB  1.3 TiB  29.57  0.96   54      up  osd.20
> >> >  21  ssd      1.81929   1.00000  1.8 TiB  616 GiB  610 GiB   1.3 GiB  4.4 GiB  1.2 TiB  33.04  1.07   60      up  osd.21
> >> >  22  ssd      1.81929   1.00000  1.8 TiB  573 GiB  567 GiB   1.6 GiB  4.1 GiB  1.3 TiB  30.75  1.00   58      up  osd.22
> >> >  23  ssd      1.81929   1.00000  1.8 TiB  616 GiB  610 GiB   1.3 GiB  4.3 GiB  1.2 TiB  33.06  1.07   60      up  osd.23
> >> >  24  ssd      1.81929   1.00000  1.8 TiB  539 GiB  534 GiB   844 MiB  3.8 GiB  1.3 TiB  28.92  0.94   51      up  osd.24
> >> >  25  ssd      1.81929   1.00000  1.8 TiB  583 GiB  576 GiB   2.1 GiB  4.1 GiB  1.3 TiB  31.27  1.02   61      up  osd.25
> >> >  26  ssd      1.81929   1.00000  1.8 TiB  617 GiB  611 GiB   1.3 GiB  4.6 GiB  1.2 TiB  33.12  1.08   61      up  osd.26
> >> >  27  ssd      1.81929   1.00000  1.8 TiB  537 GiB  532 GiB   1.2 GiB  4.1 GiB  1.3 TiB  28.84  0.94   53      up  osd.27
> >> >  28  ssd      1.81929   1.00000  1.8 TiB  527 GiB  522 GiB   1.3 GiB  4.2 GiB  1.3 TiB  28.29  0.92   53      up  osd.28
> >> >  29  ssd      1.81929   1.00000  1.8 TiB  594 GiB  588 GiB   1.5 GiB  4.6 GiB  1.2 TiB  31.91  1.04   59      up  osd.29
> >> >  30  ssd      1.81929   1.00000  1.8 TiB  528 GiB  523 GiB   1.4 GiB  4.1 GiB  1.3 TiB  28.35  0.92   53      up  osd.30
> >> >  31  ssd      1.81929   1.00000  1.8 TiB  594 GiB  589 GiB   1.6 GiB  3.8 GiB  1.2 TiB  31.89  1.03   61      up  osd.31
> >> >  -7          29.10864         -   29 TiB  8.9 TiB  8.8 TiB    23 GiB   67 GiB   20 TiB  30.66  1.00    -          host ud-03
> >> >  32  ssd      1.81929   1.00000  1.8 TiB  593 GiB  588 GiB   1.1 GiB  4.3 GiB  1.2 TiB  31.84  1.03   57      up  osd.32
> >> >  33  ssd      1.81929   1.00000  1.8 TiB  617 GiB  611 GiB   1.8 GiB  4.4 GiB  1.2 TiB  33.13  1.08   63      up  osd.33
> >> >  34  ssd      1.81929   1.00000  1.8 TiB  537 GiB  532 GiB   2.0 GiB  3.8 GiB  1.3 TiB  28.84  0.94   59      up  osd.34
> >> >  35  ssd      1.81929   1.00000  1.8 TiB  562 GiB  556 GiB   1.7 GiB  4.2 GiB  1.3 TiB  30.16  0.98   58      up  osd.35
> >> >  36  ssd      1.81929   1.00000  1.8 TiB  529 GiB  523 GiB   1.3 GiB  3.9 GiB  1.3 TiB  28.38  0.92   52      up  osd.36
> >> >  37  ssd      1.81929   1.00000  1.8 TiB  527 GiB  521 GiB   1.7 GiB  4.2 GiB  1.3 TiB  28.28  0.92   55      up  osd.37
> >> >  38  ssd      1.81929   1.00000  1.8 TiB  574 GiB  568 GiB   1.2 GiB  4.3 GiB  1.3 TiB  30.79  1.00   55      up  osd.38
> >> >  39  ssd      1.81929   1.00000  1.8 TiB  605 GiB  599 GiB   1.6 GiB  4.2 GiB  1.2 TiB  32.48  1.05   61      up  osd.39
> >> >  40  ssd      1.81929   1.00000  1.8 TiB  573 GiB  567 GiB   1.2 GiB  4.4 GiB  1.3 TiB  30.76  1.00   56      up  osd.40
> >> >  41  ssd      1.81929   1.00000  1.8 TiB  526 GiB  520 GiB   1.7 GiB  3.9 GiB  1.3 TiB  28.21  0.92   54      up  osd.41
> >> >  42  ssd      1.81929   1.00000  1.8 TiB  613 GiB  608 GiB  1010 MiB  4.4 GiB  1.2 TiB  32.91  1.07   58      up  osd.42
> >> >  43  ssd      1.81929   1.00000  1.8 TiB  606 GiB  600 GiB   1.7 GiB  4.3 GiB  1.2 TiB  32.51  1.06   61      up  osd.43
> >> >  44  ssd      1.81929   1.00000  1.8 TiB  583 GiB  577 GiB   1.6 GiB  4.2 GiB  1.3 TiB  31.29  1.02   60      up  osd.44
> >> >  45  ssd      1.81929   1.00000  1.8 TiB  618 GiB  613 GiB   1.4 GiB  4.3 GiB  1.2 TiB  33.18  1.08   62      up  osd.45
> >> >  46  ssd      1.81929   1.00000  1.8 TiB  550 GiB  544 GiB   1.5 GiB  4.2 GiB  1.3 TiB  29.50  0.96   54      up  osd.46
> >> >  47  ssd      1.81929   1.00000  1.8 TiB  526 GiB  522 GiB   692 MiB  3.7 GiB  1.3 TiB  28.25  0.92   50      up  osd.47
> >> >  -9          29.10864         -   29 TiB  9.0 TiB  8.9 TiB    26 GiB   68 GiB   20 TiB  31.04  1.01    -          host ud-04
> >> >  48  ssd      1.81929   1.00000  1.8 TiB  540 GiB  534 GiB   2.2 GiB  3.6 GiB  1.3 TiB  28.96  0.94   58      up  osd.48
> >> >  49  ssd      1.81929   1.00000  1.8 TiB  617 GiB  611 GiB   1.4 GiB  4.5 GiB  1.2 TiB  33.11  1.07   61      up  osd.49
> >> >  50  ssd      1.81929   1.00000  1.8 TiB  618 GiB  612 GiB   1.2 GiB  4.8 GiB  1.2 TiB  33.17  1.08   61      up  osd.50
> >> >  51  ssd      1.81929   1.00000  1.8 TiB  618 GiB  612 GiB   1.5 GiB  4.5 GiB  1.2 TiB  33.19  1.08   61      up  osd.51
> >> >  52  ssd      1.81929   1.00000  1.8 TiB  526 GiB  521 GiB   1.4 GiB  4.1 GiB  1.3 TiB  28.25  0.92   53      up  osd.52
> >> >  53  ssd      1.81929   1.00000  1.8 TiB  618 GiB  611 GiB   2.4 GiB  4.3 GiB  1.2 TiB  33.17  1.08   66      up  osd.53
> >> >  54  ssd      1.81929   1.00000  1.8 TiB  550 GiB  544 GiB   1.5 GiB  4.3 GiB  1.3 TiB  29.54  0.96   55      up  osd.54
> >> >  55  ssd      1.81929   1.00000  1.8 TiB  527 GiB  522 GiB   1.3 GiB  4.0 GiB  1.3 TiB  28.29  0.92   52      up  osd.55
> >> >  56  ssd      1.81929   1.00000  1.8 TiB  525 GiB  519 GiB   1.2 GiB  4.1 GiB  1.3 TiB  28.16  0.91   52      up  osd.56
> >> >  57  ssd      1.81929   1.00000  1.8 TiB  615 GiB  609 GiB   2.3 GiB  4.2 GiB  1.2 TiB  33.03  1.07   65      up  osd.57
> >> >  58  ssd      1.81929   1.00000  1.8 TiB  527 GiB  522 GiB   1.6 GiB  3.7 GiB  1.3 TiB  28.31  0.92   55      up  osd.58
> >> >  59  ssd      1.81929   1.00000  1.8 TiB  615 GiB  609 GiB   1.2 GiB  4.6 GiB  1.2 TiB  33.01  1.07   60      up  osd.59
> >> >  60  ssd      1.81929   1.00000  1.8 TiB  594 GiB  588 GiB   1.2 GiB  4.4 GiB  1.2 TiB  31.88  1.03   59      up  osd.60
> >> >  61  ssd      1.81929   1.00000  1.8 TiB  616 GiB  610 GiB   1.9 GiB  4.1 GiB  1.2 TiB  33.04  1.07   64      up  osd.61
> >> >  62  ssd      1.81929   1.00000  1.8 TiB  620 GiB  614 GiB   1.9 GiB  4.4 GiB  1.2 TiB  33.27  1.08   63      up  osd.62
> >> >  63  ssd      1.81929   1.00000  1.8 TiB  527 GiB  522 GiB   1.5 GiB  4.0 GiB  1.3 TiB  28.30  0.92   53      up  osd.63
> >> > -11          29.10864         -   29 TiB  9.0 TiB  8.9 TiB    23 GiB   65 GiB   20 TiB  30.91  1.00    -          host ud-05
> >> >  64  ssd      1.81929   1.00000  1.8 TiB  608 GiB  601 GiB   2.3 GiB  4.5 GiB  1.2 TiB  32.62  1.06   65      up  osd.64
> >> >  65  ssd      1.81929   1.00000  1.8 TiB  606 GiB  601 GiB   628 MiB  4.2 GiB  1.2 TiB  32.53  1.06   57      up  osd.65
> >> >  66  ssd      1.81929   1.00000  1.8 TiB  583 GiB  578 GiB   1.3 GiB  4.3 GiB  1.2 TiB  31.31  1.02   57      up  osd.66
> >> >  67  ssd      1.81929   1.00000  1.8 TiB  537 GiB  533 GiB   436 MiB  3.6 GiB  1.3 TiB  28.82  0.94   50      up  osd.67
> >> >  68  ssd      1.81929   1.00000  1.8 TiB  541 GiB  535 GiB   2.5 GiB  3.8 GiB  1.3 TiB  29.04  0.94   59      up  osd.68
> >> >  69  ssd      1.81929   1.00000  1.8 TiB  606 GiB  601 GiB   1.1 GiB  4.4 GiB  1.2 TiB  32.55  1.06   59      up  osd.69
> >> >  70  ssd      1.81929   1.00000  1.8 TiB  604 GiB  598 GiB   1.8 GiB  4.1 GiB  1.2 TiB  32.44  1.05   63      up  osd.70
> >> >  71  ssd      1.81929   1.00000  1.8 TiB  606 GiB  600 GiB   1.9 GiB  4.5 GiB  1.2 TiB  32.53  1.06   62      up  osd.71
> >> >  72  ssd      1.81929   1.00000  1.8 TiB  602 GiB  598 GiB   612 MiB  4.1 GiB  1.2 TiB  32.33  1.05   57      up  osd.72
> >> >  73  ssd      1.81929   1.00000  1.8 TiB  571 GiB  565 GiB   1.8 GiB  4.5 GiB  1.3 TiB  30.65  0.99   58      up  osd.73
> >> >  74  ssd      1.81929   1.00000  1.8 TiB  608 GiB  602 GiB   1.8 GiB  4.2 GiB  1.2 TiB  32.62  1.06   61      up  osd.74
> >> >  75  ssd      1.81929   1.00000  1.8 TiB  536 GiB  531 GiB   1.9 GiB  3.5 GiB  1.3 TiB  28.80  0.93   57      up  osd.75
> >> >  76  ssd      1.81929   1.00000  1.8 TiB  605 GiB  599 GiB   1.4 GiB  4.5 GiB  1.2 TiB  32.48  1.05   60      up  osd.76
> >> >  77  ssd      1.81929   1.00000  1.8 TiB  537 GiB  532 GiB   1.2 GiB  3.9 GiB  1.3 TiB  28.84  0.94   52      up  osd.77
> >> >  78  ssd      1.81929   1.00000  1.8 TiB  525 GiB  520 GiB   1.3 GiB  3.8 GiB  1.3 TiB  28.20  0.92   52      up  osd.78
> >> >  79  ssd      1.81929   1.00000  1.8 TiB  536 GiB  531 GiB   1.1 GiB  3.3 GiB  1.3 TiB  28.76  0.93   53      up  osd.79
> >> >                 TOTAL             146 TiB   45 TiB   44 TiB   119 GiB  333 GiB  101 TiB  30.81
> >> > MIN/MAX VAR: 0.91/1.08  STDDEV: 1.90
> >> >
> >> >
> >> > On Thu, 25 Jan 2024 at 16:52, Eugen Block <eblock@xxxxxx> wrote:
> >> >
> >> >> There is no definitive answer wrt MDS tuning. As is mentioned
> >> >> everywhere, it's about finding the right setup for your specific
> >> >> workload. If you can synthesize your workload (maybe scaled down a
> >> >> bit), try optimizing it in a test cluster without interrupting your
> >> >> developers too much.
> >> >> But what you haven't explained yet is what you are experiencing as
> >> >> a performance issue. Do you have numbers or a detailed description?
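For reference, numbers like the fio results quoted further up in this
thread can be produced with a plain fio run; a minimal sketch (the CephFS
mount point /mnt/cephfs/bench is an assumption, the other parameters
match the size=1G,direct=1,numjobs=3,iodepth=32 runs reported above):

  # Random-write pass at 4k blocks; swap --rw/--bs for the other passes.
  fio --name=iobench --directory=/mnt/cephfs/bench \
      --rw=randwrite --bs=4k --size=1G --numjobs=3 \
      --iodepth=32 --direct=1 --ioengine=libaio --group_reporting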
> >> >> From the fs status output you didn't seem to have too much activity
> >> >> going on (around 140 requests per second), but that's probably not
> >> >> the usual traffic? What does ceph report in its client IO output?
> >> >> Can you paste the 'ceph osd df' output as well?
> >> >> Do you have dedicated MDS servers, or are they colocated with other
> >> >> services?
> >> >>
> >> >> Zitat von Özkan Göksu <ozkangksu@xxxxxxxxx>:
> >> >>
> >> >> > Hello Eugen.
> >> >> >
> >> >> > I have read all of your MDS-related topics; thank you so much for
> >> >> > your effort on this.
> >> >> > There is not much information around, and I couldn't find an MDS
> >> >> > tuning guide at all. It seems that you are the right person to
> >> >> > discuss MDS debugging and tuning with.
> >> >> >
> >> >> > Do you have any documents, or could you tell me the proper way to
> >> >> > debug the MDS and clients?
> >> >> > Which debug logs will help me understand the limitations and tune
> >> >> > according to the data flow?
> >> >> >
> >> >> > While searching, I found this:
> >> >> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/
> >> >> > quote: "A user running VSCodium, keeping 15k caps open.. the
> >> >> > opportunistic caps recall eventually starts recalling those but
> >> >> > the (el7 kernel) client won't release them. Stopping Codium seems
> >> >> > to be the only way to release."
> >> >> >
> >> >> > Because of this, I think I also need to play around with the
> >> >> > client side.
> >> >> >
> >> >> > My main goal is increasing speed and reducing latency, and I
> >> >> > wonder whether these ideas are correct or not:
> >> >> > - Maybe I need to increase the client-side cache size, because
> >> >> > through each client multiple users request a lot of objects, and
> >> >> > clearly the client_cache_size=16 default is not enough.
> >> >> > - Maybe I need to increase the client-side maximum cache limits
> >> >> > for objects "client_oc_max_objects=1000 to 10000" and data
> >> >> > "client_oc_size=200mi to 400mi".
> >> >> > - The client cache cleaning threshold is not aggressive enough to
> >> >> > keep the free cache size in the desired range. I need to make it
> >> >> > more aggressive, but this should not reduce speed or increase
> >> >> > latency.
> >> >> >
> >> >> > mds_cache_memory_limit=4gi to 16gi
> >> >> > client_oc_max_objects=1000 to 10000
> >> >> > client_oc_size=200mi to 400mi
> >> >> > client_permissions=false #to reduce latency
> >> >> > client_cache_size=16 to 128
> >> >> >
> >> >> > What do you think?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
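For completeness, a sketch of how the tuning values proposed above could
be applied at runtime (the numbers are the thread's proposals converted
to bytes, not recommendations; note that ceph config set only reaches
ceph-fuse/libcephfs clients, since kernel clients ignore the client_oc_*
options; the pinned paths are hypothetical examples):

  # MDS cache: 4 GiB default -> 16 GiB, as proposed above.
  ceph config set mds mds_cache_memory_limit 17179869184
  # Client-side object cache proposals (fuse/libcephfs clients only).
  ceph config set client client_oc_max_objects 10000
  ceph config set client client_oc_size 419430400
  # Directory pinning for multi-active MDS, as Eugen suggests above;
  # the paths are made-up examples.
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projects
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/homes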