I started to investigate my clients. For example:

root@ud-01:~# ceph health detail
HEALTH_WARN 1 clients failing to respond to cache pressure
[WRN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
    mds.ud-data.ud-02.xcoojt(mds.0): Client bmw-m4 failing to respond to cache pressure client_id: 1275577

root@ud-01:~# ceph fs status
ud-data - 86 clients
=======
RANK  STATE            MDS              ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  ud-data.ud-02.xcoojt  Reqs:   34 /s  2926k  2827k   155k  1157k

ceph tell mds.ud-data.ud-02.xcoojt session ls | jq -r '.[] | "clientid: \(.id)= num_caps: \(.num_caps), num_leases: \(.num_leases), request_load_avg: \(.request_load_avg), num_completed_requests: \(.num_completed_requests), num_completed_flushes: \(.num_completed_flushes)"' | sort -n -t: -k3

clientid: *1275577*= num_caps: 12312, num_leases: 0, request_load_avg: 0, num_completed_requests: 0, num_completed_flushes: 1
clientid: 1275571= num_caps: 16307, num_leases: 1, request_load_avg: 2101, num_completed_requests: 0, num_completed_flushes: 3
clientid: 1282130= num_caps: 26337, num_leases: 3, request_load_avg: 116, num_completed_requests: 0, num_completed_flushes: 1
clientid: 1191789= num_caps: 32784, num_leases: 0, request_load_avg: 1846, num_completed_requests: 0, num_completed_flushes: 0
clientid: 1275535= num_caps: 79825, num_leases: 2, request_load_avg: 133, num_completed_requests: 8, num_completed_flushes: 8
clientid: 1282142= num_caps: 80581, num_leases: 6, request_load_avg: 125, num_completed_requests: 2, num_completed_flushes: 6
clientid: 1275532= num_caps: 87836, num_leases: 3, request_load_avg: 190, num_completed_requests: 2, num_completed_flushes: 6
clientid: 1275547= num_caps: 94129, num_leases: 4, request_load_avg: 149, num_completed_requests: 2, num_completed_flushes: 4
clientid: 1275553= num_caps: 96460, num_leases: 4, request_load_avg: 155, num_completed_requests: 2, num_completed_flushes: 8
clientid: 1282139= num_caps: 108882, num_leases: 25, request_load_avg: 99, num_completed_requests: 2, num_completed_flushes: 4
clientid: 1275538= num_caps: 437162, num_leases: 0, request_load_avg: 101, num_completed_requests: 2, num_completed_flushes: 0

--------------------------------------
*MY CLIENT:*
The client is actually idle and there is no reason for it to fail at all.

root@bmw-m4:~# apt list --installed | grep ceph
ceph-common/jammy-updates,now 17.2.6-0ubuntu0.22.04.2 amd64 [installed]
libcephfs2/jammy-updates,now 17.2.6-0ubuntu0.22.04.2 amd64 [installed,automatic]
python3-ceph-argparse/jammy-updates,now 17.2.6-0ubuntu0.22.04.2 amd64 [installed,automatic]
python3-ceph-common/jammy-updates,now 17.2.6-0ubuntu0.22.04.2 all [installed,automatic]
python3-cephfs/jammy-updates,now 17.2.6-0ubuntu0.22.04.2 amd64 [installed,automatic]

Let's check metrics and stats:

root@bmw-m4:/sys/kernel/debug/ceph/e42fd4b0-313b-11ee-9a00-31da71873773.client1275577# cat metrics
item                             total
------------------------------------------
opened files  / total inodes     2 / 12312
pinned i_caps / total inodes     12312 / 12312
opened inodes / total inodes     1 / 12312

item        total    avg_lat(us)  min_lat(us)  max_lat(us)  stdev(us)
-----------------------------------------------------------------------------------
read        22283    44409        430          1804853      15619
write       112702   419725       3658         8879541      6008
metadata    353322   5712         154          917903       5357

item        total    avg_sz(bytes)  min_sz(bytes)  max_sz(bytes)  total_sz(bytes)
----------------------------------------------------------------------------------------
read        22283    1701940        1              4194304        37924318602
write       112702   246211         1              4194304        27748469309

item        total    miss     hit
-------------------------------------------------
d_lease     62       63627    28564698
caps        12312    36658    44568261

root@bmw-m4:/sys/kernel/debug/ceph/e42fd4b0-313b-11ee-9a00-31da71873773.client1275577# cat bdi/stats
BdiWriteback:             0 kB
BdiReclaimable:         800 kB
BdiDirtyThresh:           0 kB
DirtyThresh:        5795340 kB
BackgroundThresh:   2894132 kB
BdiDirtied:        27316320 kB
BdiWritten:        27316320 kB
BdiWriteBandwidth:     1472 kBps
b_dirty:                  0
b_io:                     0
b_more_io:                0
b_dirty_time:             0
bdi_list:                 1
state:                    1

Last 3 days of dmesg output:

[Wed Jan 24 16:45:13 2024] xfsettingsd[653036]: segfault at 18 ip 00007fbd12f5d337 sp 00007ffd254332a0 error 4 in libxklavier.so.16.4.0[7fbd12f4d000+19000]
[Wed Jan 24 16:45:13 2024] Code: 4c 89 e7 e8 0b 56 ff ff 48 89 03 48 8b 5c 24 30 e9 d1 fd ff ff e8 b9 5b ff ff 66 0f 1f 84 00 00 00 00 00 41 54 55 48 89 f5 53 <48> 8b 42 18 48 89 d1 49 89 fc 48 89 d3 48 89 fa 48 89 ef 48 8b b0
[Thu Jan 25 06:51:31 2024] NVRM: GPU at PCI:0000:81:00: GPU-02efbb18-c9e4-3a16-d615-598959520b99
[Thu Jan 25 06:51:31 2024] NVRM: GPU Board Serial Number: 1321421015411
[Thu Jan 25 06:51:31 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=683281, name=python, Ch 00000008
[Thu Jan 25 06:56:49 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=683377, name=python, Ch 00000018
[Thu Jan 25 20:14:13 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=696062, name=python, Ch 00000008
[Fri Jan 26 04:05:40 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=700166, name=python, Ch 00000008
[Fri Jan 26 05:05:12 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=700320, name=python, Ch 00000008
[Fri Jan 26 05:44:50 2024] NVRM: GPU at PCI:0000:82:00: GPU-3af62a2c-e7eb-a7d5-c073-22f06dc7065f
[Fri Jan 26 05:44:50 2024] NVRM: GPU Board Serial Number: 1321421010400
[Fri Jan 26 05:44:50 2024] NVRM: Xid (PCI:0000:82:00): 43, pid=700757, name=python, Ch 00000018
[Fri Jan 26 05:56:02 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=701096, name=python, Ch 00000028
[Fri Jan 26 06:34:20 2024] NVRM: Xid (PCI:0000:81:00): 43, pid=701226, name=python, Ch 00000038

root@bmw-m4:/sys/kernel/debug/ceph/e42fd4b0-313b-11ee-9a00-31da71873773.client1275577# free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi        34Gi        27Gi       0.0Ki       639Mi        27Gi
Swap:          1.8Ti        18Gi       1.8Ti

root@bmw-m4:/sys/kernel/debug/ceph/e42fd4b0-313b-11ee-9a00-31da71873773.client1275577# cat /proc/vmstat
nr_free_pages 7231171
nr_zone_inactive_anon 7924766
nr_zone_active_anon 525190
nr_zone_inactive_file 44029
nr_zone_active_file 55966
nr_zone_unevictable 13042
nr_zone_write_pending 3
nr_mlock 13042
nr_bounce 0
nr_zspages 0
nr_free_cma 0
numa_hit 6701928919
numa_miss 312628341
numa_foreign 312628341
numa_interleave 31538
numa_local 6701864751
numa_other 312692567
nr_inactive_anon 7924766
nr_active_anon 525190
nr_inactive_file 44029
nr_active_file 55966
nr_unevictable 13042
nr_slab_reclaimable 61076
nr_slab_unreclaimable 63509
nr_isolated_anon 0
nr_isolated_file 0
workingset_nodes 3934
workingset_refault_anon 30325493
workingset_refault_file 14593094
workingset_activate_anon 5376050
workingset_activate_file 3250679
workingset_restore_anon 292317
workingset_restore_file 1166673
workingset_nodereclaim 488665
nr_anon_pages 8451968
nr_mapped 35731
nr_file_pages 138824
nr_dirty 3
nr_writeback 0
nr_writeback_temp 0
nr_shmem 242
nr_shmem_hugepages 0
nr_shmem_pmdmapped 0
nr_file_hugepages 0
nr_file_pmdmapped 0
nr_anon_transparent_hugepages 3588
nr_vmscan_write 33746573
nr_vmscan_immediate_reclaim 160
nr_dirtied 48165341
nr_written 80207893
nr_kernel_misc_reclaimable 0
nr_foll_pin_acquired 174002
nr_foll_pin_released 174002
nr_kernel_stack 60032
nr_page_table_pages 46041
nr_swapcached 36166
nr_dirty_threshold 1448010
nr_dirty_background_threshold 723121
pgpgin 129904699
pgpgout 299261581
pswpin 30325493
pswpout 45158221
pgalloc_dma 1024
pgalloc_dma32 57788566
pgalloc_normal 6956384725
pgalloc_movable 0
allocstall_dma 0
allocstall_dma32 0
allocstall_normal 188
allocstall_movable 63024
pgskip_dma 0
pgskip_dma32 0
pgskip_normal 0
pgskip_movable 0
pgfree 7222273815
pgactivate 1371753960
pgdeactivate 18329381
pglazyfree 10
pgfault 7795723861
pgmajfault 4600007
pglazyfreed 0
pgrefill 18575528
pgreuse 81910383
pgsteal_kswapd 980532060
pgsteal_direct 38942066
pgdemote_kswapd 0
pgdemote_direct 0
pgscan_kswapd 1135293298
pgscan_direct 58883653
pgscan_direct_throttle 15
pgscan_anon 220939938
pgscan_file 973237013
pgsteal_anon 46538607
pgsteal_file 972935519
zone_reclaim_failed 0
pginodesteal 0
slabs_scanned 25879882
kswapd_inodesteal 2179831
kswapd_low_wmark_hit_quickly 152797
kswapd_high_wmark_hit_quickly 32025
pageoutrun 204447
pgrotated 44963935
drop_pagecache 0
drop_slab 0
oom_kill 0
numa_pte_updates 2724410955
numa_huge_pte_updates 1695890
numa_hint_faults 1739823254
numa_hint_faults_local 1222358972
numa_pages_migrated 312611639
pgmigrate_success 510846802
pgmigrate_fail 875493
thp_migration_success 156413
thp_migration_fail 2
thp_migration_split 0
compact_migrate_scanned 1274073243
compact_free_scanned 8430842597
compact_isolated 400278352
compact_stall 145300
compact_fail 128562
compact_success 16738
compact_daemon_wake 170247
compact_daemon_migrate_scanned 35486283
compact_daemon_free_scanned 369870412
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 2774290
unevictable_pgs_scanned 0
unevictable_pgs_rescued 2675031
unevictable_pgs_mlocked 2813622
unevictable_pgs_munlocked 2674972
unevictable_pgs_cleared 84231
unevictable_pgs_stranded 84225
thp_fault_alloc 416468
thp_fault_fallback 19181
thp_fault_fallback_charge 0
thp_collapse_alloc 17931
thp_collapse_alloc_failed 76
thp_file_alloc 0
thp_file_fallback 0
thp_file_fallback_charge 0
thp_file_mapped 0
thp_split_page 2
thp_split_page_failed 0
thp_deferred_split_page 66
thp_split_pmd 22451
thp_split_pud 0
thp_zero_page_alloc 1
thp_zero_page_alloc_failed 0
thp_swpout 22332
thp_swpout_fallback 0
balloon_inflate 0
balloon_deflate 0
balloon_migrate 0
swap_ra 25777929
swap_ra_hit 25658825
direct_map_level2_splits 1249
direct_map_level3_splits 49
nr_unstable 0

Özkan Göksu <ozkangksu@xxxxxxxxx> wrote on Sat, 27 Jan 2024 at 02:36:

> Hello Frank.
>
> I have 84 clients (high-end servers) with: Ubuntu 20.04.5 LTS - Kernel:
> Linux 5.4.0-125-generic
>
> My cluster is on 17.2.6 quincy.
> I have some client nodes with "ceph-common/stable,now 17.2.7-1focal". I
> wonder if using newer-version clients is the main problem?
> Maybe I have a communication error.
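The jq pipeline above can also be replicated without jq when triaging which clients hold the most caps; a minimal sketch (the records and the 100k "high" threshold are illustrative, not values from the cluster):

```shell
# Sample "session ls"-style records: client id and cap count per line,
# values copied from the listing above (illustrative subset).
cat > /tmp/sessions.txt <<'EOF'
1275577 12312
1282139 108882
1275538 437162
EOF

# Rank clients by cap count, descending, and flag anything above a
# hypothetical 100k-cap threshold as worth a closer look:
sort -rn -k2,2 /tmp/sessions.txt | awk '{
    flag = ($2 >= 100000) ? " <-- high" : ""
    print "client " $1 ": " $2 " caps" flag
}'
```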
> For example, I hit this problem and I cannot collect client stats:
> https://github.com/ceph/ceph/pull/52127/files
>
> Best regards.

Frank Schilder <frans@xxxxxx> wrote on Fri, 26 Jan 2024 at 14:53:

>> Hi, this message is one of those that are often spurious. I don't recall
>> in which thread/PR/tracker I read it, but the story was something like this:
>>
>> If an MDS comes under memory pressure it will request dentry items back
>> from *all* clients, not just the active ones or the ones holding many of
>> them. If you have a client that is below the min-threshold for dentries
>> (it's one of the client/MDS tuning options), it will not respond. This
>> client will be flagged as not responding, which is a false positive.
>>
>> I believe the devs are working on a fix to get rid of these spurious
>> warnings. There is a "bug/feature" in the MDS that does not clear this
>> warning flag for inactive clients. Hence, the message hangs around and
>> never disappears. I usually clear it with an "echo 3 > /proc/sys/vm/drop_caches"
>> on the client. However, except for being annoying in the dashboard, it has
>> no performance or otherwise negative impact.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Eugen Block <eblock@xxxxxx>
>> Sent: Friday, January 26, 2024 10:05 AM
>> To: Özkan Göksu
>> Cc: ceph-users@xxxxxxx
>> Subject: Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)
>>
>> Performance for small files is more about IOPS than throughput,
>> and the IOPS in your fio tests look okay to me. What you could try is
>> to split the PGs to get around 150 or 200 PGs per OSD. You're
>> currently at around 60 according to the ceph osd df output. Before you
>> do that, can you share 'ceph pg ls-by-pool cephfs.ud-data.data | head'?
>> I don't need the whole output, just to see how many objects each PG has.
>> We had a case once where that helped, but it was an older
>> cluster and the pool was backed by HDDs with separate RocksDB on SSDs.
>> So this might not be the solution here, but it could improve things as
>> well.
>>
>> Zitat von Özkan Göksu <ozkangksu@xxxxxxxxx>:
>>
>> > Every user has a 1x subvolume and I only have 1 pool.
>> > At the beginning we were using each subvolume for the LDAP home
>> > directory + user data.
>> > When a user logged in to any Docker container on any host, it used the
>> > cluster for home, and for the user-related data we had a second
>> > directory in the same subvolume.
>> > From time to time users experienced a very slow home environment, and
>> > after a month it became almost impossible to use home. VNC sessions
>> > became unresponsive and slow, etc.
>> >
>> > 2 weeks ago, I had to migrate home to a ZFS storage and now the overall
>> > performance is better for user_data alone, without home.
>> > But still the performance is not as good as I expected because of the
>> > problems related to the MDS.
>> > The usage is low but allocation is high, and CPU usage is high. You saw
>> > the IO op/s; it's nothing, but allocation is high.
>> >
>> > I developed a fio benchmark script and ran it on 4x test servers at
>> > the same time; the results are below:
>> > Script:
>> > https://github.com/ozkangoksu/benchmark/blob/8f5df87997864c25ef32447e02fcd41fda0d2a67/iobench.sh
>> >
>> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-01.txt
>> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-02.txt
>> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-03.txt
>> > https://github.com/ozkangoksu/benchmark/blob/main/benchmark-results/iobench-client-04.txt
>> >
>> > While running the benchmark, I took sample values for each type of
>> > iobench run.
>> >
>> > Seq Write benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
>> > client: 70 MiB/s rd, 762 MiB/s wr, 337 op/s rd, 24.41k op/s wr
>> > client: 60 MiB/s rd, 551 MiB/s wr, 303 op/s rd, 35.12k op/s wr
>> > client: 13 MiB/s rd, 161 MiB/s wr, 101 op/s rd, 41.30k op/s wr
>> >
>> > Seq Read benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
>> > client: 1.6 GiB/s rd, 219 KiB/s wr, 28.76k op/s rd, 89 op/s wr
>> > client: 370 MiB/s rd, 475 KiB/s wr, 90.38k op/s rd, 89 op/s wr
>> >
>> > Rand Write benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
>> > client: 63 MiB/s rd, 1.5 GiB/s wr, 8.77k op/s rd, 5.50k op/s wr
>> > client: 14 MiB/s rd, 1.8 GiB/s wr, 81 op/s rd, 13.86k op/s wr
>> > client: 6.6 MiB/s rd, 1.2 GiB/s wr, 61 op/s rd, 30.13k op/s wr
>> >
>> > Rand Read benchmarking: size=1G,direct=1,numjobs=3,iodepth=32
>> > client: 317 MiB/s rd, 841 MiB/s wr, 426 op/s rd, 10.98k op/s wr
>> > client: 2.8 GiB/s rd, 882 MiB/s wr, 25.68k op/s rd, 291 op/s wr
>> > client: 4.0 GiB/s rd, 226 MiB/s wr, 89.63k op/s rd, 124 op/s wr
>> > client: 2.4 GiB/s rd, 295 KiB/s wr, 197.86k op/s rd, 20 op/s wr
>> >
>> > It seems I only have problems with the 4K, 8K, and 16K block sizes.
>> >
>> > Eugen Block <eblock@xxxxxx> wrote on Thu, 25 Jan 2024 at 19:06:
>> >
>> >> I understand that your MDS shows high CPU usage, but other than that,
>> >> what is your performance issue? Do users complain? Do some operations
>> >> take longer than expected? Are the OSDs saturated during those phases?
>> >> Because the cache pressure messages don't necessarily mean that users
>> >> will notice anything.
>> >> MDS daemons are single-threaded, so that might be a bottleneck. In that
>> >> case multi-active MDS might help, which you already tried and
>> >> experienced OOM killers. But you might have to disable the MDS
>> >> balancer as someone else mentioned.
And then you could think about >> >> pinning, is it possible to split the CephFS into multiple >> >> subdirectories and pin them to different ranks? >> >> But first I’d still like to know what the performance issue really is. >> >> >> >> Zitat von Özkan Göksu <ozkangksu@xxxxxxxxx>: >> >> >> >> > I will try my best to explain my situation. >> >> > >> >> > I don't have a separate mds server. I have 5 identical nodes, 3 of >> them >> >> > mons, and I use the other 2 as active and standby mds. (currently I >> have >> >> > left overs from max_mds 4) >> >> > >> >> > root@ud-01:~# ceph -s >> >> > cluster: >> >> > id: e42fd4b0-313b-11ee-9a00-31da71873773 >> >> > health: HEALTH_WARN >> >> > 1 clients failing to respond to cache pressure >> >> > >> >> > services: >> >> > mon: 3 daemons, quorum ud-01,ud-02,ud-03 (age 9d) >> >> > mgr: ud-01.qycnol(active, since 8d), standbys: ud-02.tfhqfd >> >> > mds: 1/1 daemons up, 4 standby >> >> > osd: 80 osds: 80 up (since 9d), 80 in (since 5M) >> >> > >> >> > data: >> >> > volumes: 1/1 healthy >> >> > pools: 3 pools, 2305 pgs >> >> > objects: 106.58M objects, 25 TiB >> >> > usage: 45 TiB used, 101 TiB / 146 TiB avail >> >> > pgs: 2303 active+clean >> >> > 2 active+clean+scrubbing+deep >> >> > >> >> > io: >> >> > client: 16 MiB/s rd, 3.4 MiB/s wr, 77 op/s rd, 23 op/s wr >> >> > >> >> > ------------------------------ >> >> > root@ud-01:~# ceph fs status >> >> > ud-data - 84 clients >> >> > ======= >> >> > RANK STATE MDS ACTIVITY DNS INOS >> DIRS >> >> > CAPS >> >> > 0 active ud-data.ud-02.xcoojt Reqs: 40 /s 2579k 2578k >> 169k >> >> > 3048k >> >> > POOL TYPE USED AVAIL >> >> > cephfs.ud-data.meta metadata 136G 44.9T >> >> > cephfs.ud-data.data data 44.3T 44.9T >> >> > >> >> > ------------------------------ >> >> > root@ud-01:~# ceph health detail >> >> > HEALTH_WARN 1 clients failing to respond to cache pressure >> >> > [WRN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache >> pressure >> >> > mds.ud-data.ud-02.xcoojt(mds.0): 
Client bmw-m4 failing to >> respond to >> >> > cache pressure client_id: 1275577 >> >> > >> >> > ------------------------------ >> >> > When I check the failing client with session ls I see only "num_caps: >> >> 12298" >> >> > >> >> > ceph tell mds.ud-data.ud-02.xcoojt session ls | jq -r '.[] | >> "clientid: >> >> > \(.id)= num_caps: \(.num_caps), num_leases: \(.num_leases), >> >> > request_load_avg: \(.request_load_avg), num_completed_requests: >> >> > \(.num_completed_requests), num_completed_flushes: >> >> > \(.num_completed_flushes)"' | sort -n -t: -k3 >> >> > >> >> > clientid: 1275577= num_caps: 12298, num_leases: 0, request_load_avg: >> 0, >> >> > num_completed_requests: 0, num_completed_flushes: 1 >> >> > clientid: 1294542= num_caps: 13000, num_leases: 12, request_load_avg: >> >> 105, >> >> > num_completed_requests: 0, num_completed_flushes: 6 >> >> > clientid: 1282187= num_caps: 16869, num_leases: 1, request_load_avg: >> 0, >> >> > num_completed_requests: 0, num_completed_flushes: 1 >> >> > clientid: 1275589= num_caps: 18943, num_leases: 0, request_load_avg: >> 52, >> >> > num_completed_requests: 0, num_completed_flushes: 1 >> >> > clientid: 1282154= num_caps: 24747, num_leases: 1, request_load_avg: >> 57, >> >> > num_completed_requests: 2, num_completed_flushes: 2 >> >> > clientid: 1275553= num_caps: 25120, num_leases: 2, request_load_avg: >> 116, >> >> > num_completed_requests: 2, num_completed_flushes: 8 >> >> > clientid: 1282142= num_caps: 27185, num_leases: 6, request_load_avg: >> 128, >> >> > num_completed_requests: 0, num_completed_flushes: 8 >> >> > clientid: 1275535= num_caps: 40364, num_leases: 6, request_load_avg: >> 111, >> >> > num_completed_requests: 2, num_completed_flushes: 8 >> >> > clientid: 1282130= num_caps: 41483, num_leases: 0, request_load_avg: >> 135, >> >> > num_completed_requests: 0, num_completed_flushes: 1 >> >> > clientid: 1275547= num_caps: 42953, num_leases: 4, request_load_avg: >> 119, >> >> > num_completed_requests: 2, 
num_completed_flushes: 6 >> >> > clientid: 1282139= num_caps: 45435, num_leases: 27, >> request_load_avg: 84, >> >> > num_completed_requests: 2, num_completed_flushes: 34 >> >> > clientid: 1282136= num_caps: 48374, num_leases: 8, request_load_avg: >> 0, >> >> > num_completed_requests: 1, num_completed_flushes: 1 >> >> > clientid: 1275532= num_caps: 48664, num_leases: 7, request_load_avg: >> 115, >> >> > num_completed_requests: 2, num_completed_flushes: 8 >> >> > clientid: 1191789= num_caps: 130319, num_leases: 0, request_load_avg: >> >> 1753, >> >> > num_completed_requests: 0, num_completed_flushes: 0 >> >> > clientid: 1275571= num_caps: 139488, num_leases: 0, >> request_load_avg: 2, >> >> > num_completed_requests: 0, num_completed_flushes: 1 >> >> > clientid: 1282133= num_caps: 145487, num_leases: 0, >> request_load_avg: 8, >> >> > num_completed_requests: 1, num_completed_flushes: 1 >> >> > clientid: 1534496= num_caps: 1041316, num_leases: 0, >> request_load_avg: 0, >> >> > num_completed_requests: 0, num_completed_flushes: 1 >> >> > >> >> > ------------------------------ >> >> > When I check the dashboard/service/mds I see %120+ CPU usage on >> active >> >> MDS >> >> > but on the host everything is almost idle and disk waits are very >> low. 
>> >> > >> >> > avg-cpu: %user %nice %system %iowait %steal %idle >> >> > 0.61 0.00 0.38 0.41 0.00 98.60 >> >> > >> >> > Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz >> w/s >> >> > wMB/s wrqm/s %wrqm w_await wareq-sz d/s dMB/s drqm/s >> >> %drqm >> >> > d_await dareq-sz f/s f_await aqu-sz %util >> >> > sdc 2.00 0.01 0.00 0.00 0.50 6.00 >> 20.00 >> >> > 0.04 0.00 0.00 0.50 2.00 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 10.00 0.60 0.02 1.20 >> >> > sdd 3.00 0.02 0.00 0.00 0.67 8.00 >> 285.00 >> >> > 1.84 77.00 21.27 0.44 6.61 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 114.00 0.83 0.22 22.40 >> >> > sde 1.00 0.01 0.00 0.00 1.00 8.00 >> 36.00 >> >> > 0.08 3.00 7.69 0.64 2.33 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 18.00 0.67 0.04 1.60 >> >> > sdf 5.00 0.04 0.00 0.00 0.40 7.20 >> 40.00 >> >> > 0.09 3.00 6.98 0.53 2.30 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 20.00 0.70 0.04 2.00 >> >> > sdg 11.00 0.08 0.00 0.00 0.73 7.27 >> 36.00 >> >> > 0.09 4.00 10.00 0.50 2.44 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 18.00 0.72 0.04 3.20 >> >> > sdh 5.00 0.03 0.00 0.00 0.60 5.60 >> 46.00 >> >> > 0.10 2.00 4.17 0.59 2.17 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 23.00 0.83 0.05 2.80 >> >> > sdi 7.00 0.04 0.00 0.00 0.43 6.29 >> 36.00 >> >> > 0.07 1.00 2.70 0.47 2.11 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 18.00 0.61 0.03 2.40 >> >> > sdj 5.00 0.04 0.00 0.00 0.80 7.20 >> 42.00 >> >> > 0.09 1.00 2.33 0.67 2.10 0.00 0.00 0.00 >> >> 0.00 >> >> > 0.00 0.00 21.00 0.81 0.05 3.20 >> >> > >> >> > ------------------------------ >> >> > Other than this 5x node cluster, I also have a 3x node cluster with >> >> > identical hardware but it serves for a different purpose and data >> >> workload. >> >> > In this cluster I don't have any problem and MDS default settings >> seems >> >> > enough. 
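The subdirectory pinning suggested earlier in the thread would look something like this on a client mount; a hedged sketch (the mount point and directory names are hypothetical, `ceph.dir.pin` is the standard CephFS xattr, and this only takes effect with max_mds >= 2):

```shell
# Pin two top-level directories to different MDS ranks so that each rank
# serves a disjoint part of the tree (paths are hypothetical examples):
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/users
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projects

# Verify the pin:
getfattr -n ceph.dir.pin /mnt/cephfs/users

# A value of -1 removes the pin and restores default balancer behaviour:
setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/users
```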
>> >> > The only difference between two cluster is, 5x node cluster used >> directly >> >> > by users, 3x node cluster used heavily to read and write data via >> >> projects >> >> > not by users. So allocate and de-allocate will be better. >> >> > >> >> > I guess I just have a problematic use case on the 5x node cluster >> and as >> >> I >> >> > mentioned above, I might have the similar problem but I don't know >> how to >> >> > debug it. >> >> > >> >> > >> >> >> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/ >> >> > quote:"A user running VSCodium, keeping 15k caps open.. the >> opportunistic >> >> > caps recall eventually starts recalling those but the (el7 kernel) >> client >> >> > won't release them. Stopping Codium seems to be the only way to >> release." >> >> > >> >> > ------------------------------ >> >> > Before reading the osd df you should know that I created 2x >> >> > OSD/per"CT4000MX500SSD1" >> >> > # ceph osd df tree >> >> > ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP >> >> META >> >> > AVAIL %USE VAR PGS STATUS TYPE NAME >> >> > -1 145.54321 - 146 TiB 45 TiB 44 TiB 119 >> GiB 333 >> >> > GiB 101 TiB 30.81 1.00 - root default >> >> > -3 29.10864 - 29 TiB 8.9 TiB 8.8 TiB 25 >> GiB 66 >> >> > GiB 20 TiB 30.54 0.99 - host ud-01 >> >> > 0 ssd 1.81929 1.00000 1.8 TiB 616 GiB 610 GiB 1.4 >> GiB 4.5 >> >> > GiB 1.2 TiB 33.04 1.07 61 up osd.0 >> >> > 1 ssd 1.81929 1.00000 1.8 TiB 527 GiB 521 GiB 1.5 >> GiB 4.0 >> >> > GiB 1.3 TiB 28.28 0.92 53 up osd.1 >> >> > 2 ssd 1.81929 1.00000 1.8 TiB 595 GiB 589 GiB 2.3 >> GiB 4.0 >> >> > GiB 1.2 TiB 31.96 1.04 63 up osd.2 >> >> > 3 ssd 1.81929 1.00000 1.8 TiB 527 GiB 521 GiB 1.8 >> GiB 4.2 >> >> > GiB 1.3 TiB 28.30 0.92 55 up osd.3 >> >> > 4 ssd 1.81929 1.00000 1.8 TiB 525 GiB 520 GiB 1.3 >> GiB 3.9 >> >> > GiB 1.3 TiB 28.21 0.92 52 up osd.4 >> >> > 5 ssd 1.81929 1.00000 1.8 TiB 592 GiB 586 GiB 1.8 >> GiB 3.8 >> >> > GiB 1.2 TiB 31.76 1.03 61 up osd.5 >> >> > 6 ssd 
1.81929 1.00000 1.8 TiB 559 GiB 553 GiB 1.8 >> GiB 4.3 >> >> > GiB 1.3 TiB 30.03 0.97 57 up osd.6 >> >> > 7 ssd 1.81929 1.00000 1.8 TiB 602 GiB 597 GiB 836 >> MiB 4.4 >> >> > GiB 1.2 TiB 32.32 1.05 58 up osd.7 >> >> > 8 ssd 1.81929 1.00000 1.8 TiB 614 GiB 609 GiB 1.2 >> GiB 4.5 >> >> > GiB 1.2 TiB 32.98 1.07 60 up osd.8 >> >> > 9 ssd 1.81929 1.00000 1.8 TiB 571 GiB 565 GiB 2.2 >> GiB 4.2 >> >> > GiB 1.3 TiB 30.67 1.00 61 up osd.9 >> >> > 10 ssd 1.81929 1.00000 1.8 TiB 528 GiB 522 GiB 1.3 >> GiB 4.1 >> >> > GiB 1.3 TiB 28.33 0.92 52 up osd.10 >> >> > 11 ssd 1.81929 1.00000 1.8 TiB 551 GiB 546 GiB 1.5 >> GiB 3.6 >> >> > GiB 1.3 TiB 29.57 0.96 56 up osd.11 >> >> > 12 ssd 1.81929 1.00000 1.8 TiB 594 GiB 588 GiB 1.8 >> GiB 4.4 >> >> > GiB 1.2 TiB 31.91 1.04 61 up osd.12 >> >> > 13 ssd 1.81929 1.00000 1.8 TiB 561 GiB 555 GiB 1.1 >> GiB 4.3 >> >> > GiB 1.3 TiB 30.10 0.98 55 up osd.13 >> >> > 14 ssd 1.81929 1.00000 1.8 TiB 616 GiB 609 GiB 1.9 >> GiB 4.2 >> >> > GiB 1.2 TiB 33.04 1.07 64 up osd.14 >> >> > 15 ssd 1.81929 1.00000 1.8 TiB 525 GiB 520 GiB 1.1 >> GiB 4.0 >> >> > GiB 1.3 TiB 28.20 0.92 51 up osd.15 >> >> > -5 29.10864 - 29 TiB 9.0 TiB 8.9 TiB 22 >> GiB 67 >> >> > GiB 20 TiB 30.89 1.00 - host ud-02 >> >> > 16 ssd 1.81929 1.00000 1.8 TiB 617 GiB 611 GiB 1.7 >> GiB 4.7 >> >> > GiB 1.2 TiB 33.12 1.08 63 up osd.16 >> >> > 17 ssd 1.81929 1.00000 1.8 TiB 582 GiB 577 GiB 1.6 >> GiB 4.0 >> >> > GiB 1.3 TiB 31.26 1.01 59 up osd.17 >> >> > 18 ssd 1.81929 1.00000 1.8 TiB 583 GiB 578 GiB 418 >> MiB 4.0 >> >> > GiB 1.3 TiB 31.29 1.02 54 up osd.18 >> >> > 19 ssd 1.81929 1.00000 1.8 TiB 550 GiB 544 GiB 1.5 >> GiB 4.0 >> >> > GiB 1.3 TiB 29.50 0.96 56 up osd.19 >> >> > 20 ssd 1.81929 1.00000 1.8 TiB 551 GiB 546 GiB 1.1 >> GiB 4.1 >> >> > GiB 1.3 TiB 29.57 0.96 54 up osd.20 >> >> > 21 ssd 1.81929 1.00000 1.8 TiB 616 GiB 610 GiB 1.3 >> GiB 4.4 >> >> > GiB 1.2 TiB 33.04 1.07 60 up osd.21 >> >> > 22 ssd 1.81929 1.00000 1.8 TiB 573 GiB 567 GiB 1.6 >> GiB 4.1 >> >> > GiB 1.3 TiB 30.75 
1.00 58 up osd.22 >> >> > 23 ssd 1.81929 1.00000 1.8 TiB 616 GiB 610 GiB 1.3 >> GiB 4.3 >> >> > GiB 1.2 TiB 33.06 1.07 60 up osd.23 >> >> > 24 ssd 1.81929 1.00000 1.8 TiB 539 GiB 534 GiB 844 >> MiB 3.8 >> >> > GiB 1.3 TiB 28.92 0.94 51 up osd.24 >> >> > 25 ssd 1.81929 1.00000 1.8 TiB 583 GiB 576 GiB 2.1 >> GiB 4.1 >> >> > GiB 1.3 TiB 31.27 1.02 61 up osd.25 >> >> > 26 ssd 1.81929 1.00000 1.8 TiB 617 GiB 611 GiB 1.3 >> GiB 4.6 >> >> > GiB 1.2 TiB 33.12 1.08 61 up osd.26 >> >> > 27 ssd 1.81929 1.00000 1.8 TiB 537 GiB 532 GiB 1.2 >> GiB 4.1 >> >> > GiB 1.3 TiB 28.84 0.94 53 up osd.27 >> >> > 28 ssd 1.81929 1.00000 1.8 TiB 527 GiB 522 GiB 1.3 >> GiB 4.2 >> >> > GiB 1.3 TiB 28.29 0.92 53 up osd.28 >> >> > 29 ssd 1.81929 1.00000 1.8 TiB 594 GiB 588 GiB 1.5 >> GiB 4.6 >> >> > GiB 1.2 TiB 31.91 1.04 59 up osd.29 >> >> > 30 ssd 1.81929 1.00000 1.8 TiB 528 GiB 523 GiB 1.4 >> GiB 4.1 >> >> > GiB 1.3 TiB 28.35 0.92 53 up osd.30 >> >> > 31 ssd 1.81929 1.00000 1.8 TiB 594 GiB 589 GiB 1.6 >> GiB 3.8 >> >> > GiB 1.2 TiB 31.89 1.03 61 up osd.31 >> >> > -7 29.10864 - 29 TiB 8.9 TiB 8.8 TiB 23 >> GiB 67 >> >> > GiB 20 TiB 30.66 1.00 - host ud-03 >> >> > 32 ssd 1.81929 1.00000 1.8 TiB 593 GiB 588 GiB 1.1 >> GiB 4.3 >> >> > GiB 1.2 TiB 31.84 1.03 57 up osd.32 >> >> > 33 ssd 1.81929 1.00000 1.8 TiB 617 GiB 611 GiB 1.8 >> GiB 4.4 >> >> > GiB 1.2 TiB 33.13 1.08 63 up osd.33 >> >> > 34 ssd 1.81929 1.00000 1.8 TiB 537 GiB 532 GiB 2.0 >> GiB 3.8 >> >> > GiB 1.3 TiB 28.84 0.94 59 up osd.34 >> >> > 35 ssd 1.81929 1.00000 1.8 TiB 562 GiB 556 GiB 1.7 >> GiB 4.2 >> >> > GiB 1.3 TiB 30.16 0.98 58 up osd.35 >> >> > 36 ssd 1.81929 1.00000 1.8 TiB 529 GiB 523 GiB 1.3 >> GiB 3.9 >> >> > GiB 1.3 TiB 28.38 0.92 52 up osd.36 >> >> > 37 ssd 1.81929 1.00000 1.8 TiB 527 GiB 521 GiB 1.7 >> GiB 4.2 >> >> > GiB 1.3 TiB 28.28 0.92 55 up osd.37 >> >> > 38 ssd 1.81929 1.00000 1.8 TiB 574 GiB 568 GiB 1.2 >> GiB 4.3 >> >> > GiB 1.3 TiB 30.79 1.00 55 up osd.38 >> >> > 39 ssd 1.81929 1.00000 1.8 TiB 605 GiB 599 GiB 
> …                                        1.6 GiB   4.2 GiB  1.2 TiB  32.48  1.05  61      up  osd.39
>  40  ssd  1.81929  1.00000  1.8 TiB  573 GiB  567 GiB  1.2 GiB   4.4 GiB  1.3 TiB  30.76  1.00  56      up  osd.40
>  41  ssd  1.81929  1.00000  1.8 TiB  526 GiB  520 GiB  1.7 GiB   3.9 GiB  1.3 TiB  28.21  0.92  54      up  osd.41
>  42  ssd  1.81929  1.00000  1.8 TiB  613 GiB  608 GiB  1010 MiB  4.4 GiB  1.2 TiB  32.91  1.07  58      up  osd.42
>  43  ssd  1.81929  1.00000  1.8 TiB  606 GiB  600 GiB  1.7 GiB   4.3 GiB  1.2 TiB  32.51  1.06  61      up  osd.43
>  44  ssd  1.81929  1.00000  1.8 TiB  583 GiB  577 GiB  1.6 GiB   4.2 GiB  1.3 TiB  31.29  1.02  60      up  osd.44
>  45  ssd  1.81929  1.00000  1.8 TiB  618 GiB  613 GiB  1.4 GiB   4.3 GiB  1.2 TiB  33.18  1.08  62      up  osd.45
>  46  ssd  1.81929  1.00000  1.8 TiB  550 GiB  544 GiB  1.5 GiB   4.2 GiB  1.3 TiB  29.50  0.96  54      up  osd.46
>  47  ssd  1.81929  1.00000  1.8 TiB  526 GiB  522 GiB  692 MiB   3.7 GiB  1.3 TiB  28.25  0.92  50      up  osd.47
>  -9       29.10864           29 TiB  9.0 TiB  8.9 TiB  26 GiB    68 GiB   20 TiB   31.04  1.01   -          host ud-04
>  48  ssd  1.81929  1.00000  1.8 TiB  540 GiB  534 GiB  2.2 GiB   3.6 GiB  1.3 TiB  28.96  0.94  58      up  osd.48
>  49  ssd  1.81929  1.00000  1.8 TiB  617 GiB  611 GiB  1.4 GiB   4.5 GiB  1.2 TiB  33.11  1.07  61      up  osd.49
>  50  ssd  1.81929  1.00000  1.8 TiB  618 GiB  612 GiB  1.2 GiB   4.8 GiB  1.2 TiB  33.17  1.08  61      up  osd.50
>  51  ssd  1.81929  1.00000  1.8 TiB  618 GiB  612 GiB  1.5 GiB   4.5 GiB  1.2 TiB  33.19  1.08  61      up  osd.51
>  52  ssd  1.81929  1.00000  1.8 TiB  526 GiB  521 GiB  1.4 GiB   4.1 GiB  1.3 TiB  28.25  0.92  53      up  osd.52
>  53  ssd  1.81929  1.00000  1.8 TiB  618 GiB  611 GiB  2.4 GiB   4.3 GiB  1.2 TiB  33.17  1.08  66      up  osd.53
>  54  ssd  1.81929  1.00000  1.8 TiB  550 GiB  544 GiB  1.5 GiB   4.3 GiB  1.3 TiB  29.54  0.96  55      up  osd.54
>  55  ssd  1.81929  1.00000  1.8 TiB  527 GiB  522 GiB  1.3 GiB   4.0 GiB  1.3 TiB  28.29  0.92  52      up  osd.55
>  56  ssd  1.81929  1.00000  1.8 TiB  525 GiB  519 GiB  1.2 GiB   4.1 GiB  1.3 TiB  28.16  0.91  52      up  osd.56
>  57  ssd  1.81929  1.00000  1.8 TiB  615 GiB  609 GiB  2.3 GiB   4.2 GiB  1.2 TiB  33.03  1.07  65      up  osd.57
>  58  ssd  1.81929  1.00000  1.8 TiB  527 GiB  522 GiB  1.6 GiB   3.7 GiB  1.3 TiB  28.31  0.92  55      up  osd.58
>  59  ssd  1.81929  1.00000  1.8 TiB  615 GiB  609 GiB  1.2 GiB   4.6 GiB  1.2 TiB  33.01  1.07  60      up  osd.59
>  60  ssd  1.81929  1.00000  1.8 TiB  594 GiB  588 GiB  1.2 GiB   4.4 GiB  1.2 TiB  31.88  1.03  59      up  osd.60
>  61  ssd  1.81929  1.00000  1.8 TiB  616 GiB  610 GiB  1.9 GiB   4.1 GiB  1.2 TiB  33.04  1.07  64      up  osd.61
>  62  ssd  1.81929  1.00000  1.8 TiB  620 GiB  614 GiB  1.9 GiB   4.4 GiB  1.2 TiB  33.27  1.08  63      up  osd.62
>  63  ssd  1.81929  1.00000  1.8 TiB  527 GiB  522 GiB  1.5 GiB   4.0 GiB  1.3 TiB  28.30  0.92  53      up  osd.63
> -11       29.10864           29 TiB  9.0 TiB  8.9 TiB  23 GiB    65 GiB   20 TiB   30.91  1.00   -          host ud-05
>  64  ssd  1.81929  1.00000  1.8 TiB  608 GiB  601 GiB  2.3 GiB   4.5 GiB  1.2 TiB  32.62  1.06  65      up  osd.64
>  65  ssd  1.81929  1.00000  1.8 TiB  606 GiB  601 GiB  628 MiB   4.2 GiB  1.2 TiB  32.53  1.06  57      up  osd.65
>  66  ssd  1.81929  1.00000  1.8 TiB  583 GiB  578 GiB  1.3 GiB   4.3 GiB  1.2 TiB  31.31  1.02  57      up  osd.66
>  67  ssd  1.81929  1.00000  1.8 TiB  537 GiB  533 GiB  436 MiB   3.6 GiB  1.3 TiB  28.82  0.94  50      up  osd.67
>  68  ssd  1.81929  1.00000  1.8 TiB  541 GiB  535 GiB  2.5 GiB   3.8 GiB  1.3 TiB  29.04  0.94  59      up  osd.68
>  69  ssd  1.81929  1.00000  1.8 TiB  606 GiB  601 GiB  1.1 GiB   4.4 GiB  1.2 TiB  32.55  1.06  59      up  osd.69
>  70  ssd  1.81929  1.00000  1.8 TiB  604 GiB  598 GiB  1.8 GiB   4.1 GiB  1.2 TiB  32.44  1.05  63      up  osd.70
>  71  ssd  1.81929  1.00000  1.8 TiB  606 GiB  600 GiB  1.9 GiB   4.5 GiB  1.2 TiB  32.53  1.06  62      up  osd.71
>  72  ssd  1.81929  1.00000  1.8 TiB  602 GiB  598 GiB  612 MiB   4.1 GiB  1.2 TiB  32.33  1.05  57      up  osd.72
>  73  ssd  1.81929  1.00000  1.8 TiB  571 GiB  565 GiB  1.8 GiB   4.5 GiB  1.3 TiB  30.65  0.99  58      up  osd.73
>  74  ssd  1.81929  1.00000  1.8 TiB  608 GiB  602 GiB  1.8 GiB   4.2 GiB  1.2 TiB  32.62  1.06  61      up  osd.74
>  75  ssd  1.81929  1.00000  1.8 TiB  536 GiB  531 GiB  1.9 GiB   3.5 GiB  1.3 TiB  28.80  0.93  57      up  osd.75
>  76  ssd  1.81929  1.00000  1.8 TiB  605 GiB  599 GiB  1.4 GiB   4.5 GiB  1.2 TiB  32.48  1.05  60      up  osd.76
>  77  ssd  1.81929  1.00000  1.8 TiB  537 GiB  532 GiB  1.2 GiB   3.9 GiB  1.3 TiB  28.84  0.94  52      up  osd.77
>  78  ssd  1.81929  1.00000  1.8 TiB  525 GiB  520 GiB  1.3 GiB   3.8 GiB  1.3 TiB  28.20  0.92  52      up  osd.78
>  79  ssd  1.81929  1.00000  1.8 TiB  536 GiB  531 GiB  1.1 GiB   3.3 GiB  1.3 TiB  28.76  0.93  53      up  osd.79
>                     TOTAL   146 TiB   45 TiB   44 TiB  119 GiB   333 GiB  101 TiB  30.81
> MIN/MAX VAR: 0.91/1.08  STDDEV: 1.90
>
> Eugen Block <eblock@xxxxxx> wrote on Thu, 25 Jan 2024 at 16:52:
>
>> There is no definitive answer wrt MDS tuning. As is mentioned everywhere,
>> it's about finding the right setup for your specific workload. If you can
>> synthesize your workload (maybe scaled down a bit), try optimizing it in a
>> test cluster without interrupting your developers too much.
>> But what you haven't explained yet is what you are experiencing as a
>> performance issue. Do you have numbers or a detailed description?
>> From the fs status output you didn't seem to have too much activity
>> going on (around 140 requests per second), but that's probably not the
>> usual traffic? What does ceph report in its client IO output?
>> Can you paste the 'ceph osd df' output as well?
>> Do you have dedicated MDS servers, or are they colocated with other
>> services?
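Eugen's questions map onto a handful of standard commands. The sketch below collects the numbers he asks for; the MDS daemon name matches this cluster (`ud-data.ud-02.xcoojt`) and would need adjusting elsewhere, and it assumes a cephadm-managed deployment for the last step.

```shell
# Aggregate client IO rates (the "io:" section at the bottom of the output):
ceph status

# Per-rank MDS request rate, dentry/inode cache size and outstanding caps:
ceph fs status ud-data

# Raw usage and per-OSD balance, grouped by host:
ceph osd df tree

# MDS-internal counters (request latencies, cache stats, cap recalls);
# filter with jq as needed:
ceph tell mds.ud-data.ud-02.xcoojt perf dump

# Check whether the MDS daemons share hosts with OSDs/MONs (cephadm):
ceph orch ps --daemon-type mds
```

These commands only read state, so they are safe to run on a production cluster; `perf dump` counters are cumulative since daemon start, so compare two samples taken a known interval apart to get rates.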
>>
>> Quoting Özkan Göksu <ozkangksu@xxxxxxxxx>:
>>
>>> Hello Eugen.
>>>
>>> I read all of your MDS-related topics, and thank you so much for your
>>> effort on this. There is not much information available and I couldn't
>>> find an MDS tuning guide at all. It seems that you are the right person
>>> to discuss MDS debugging and tuning with.
>>>
>>> Do you have any documents, or could you tell me the proper way to debug
>>> the MDS and its clients? Which debug logs will guide me to understand
>>> the limitations and help me tune according to the data flow?
>>>
>>> While searching, I found this:
>>> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/
>>> quote: "A user running VSCodium, keeping 15k caps open.. the opportunistic
>>> caps recall eventually starts recalling those but the (el7 kernel) client
>>> won't release them. Stopping Codium seems to be the only way to release."
>>>
>>> Because of this, I think I also need to play around with the client side.
>>>
>>> My main goal is increasing speed and reducing latency, and I wonder
>>> whether these ideas are correct:
>>> - Maybe I need to increase the client-side cache size, because through
>>>   each client multiple users request a lot of objects, and clearly the
>>>   client_cache_size=16 default is not enough.
>>> - Maybe I need to increase the client-side maximum cache limits for
>>>   objects ("client_oc_max_objects=1000 to 10000") and data
>>>   ("client_oc_size=200mi to 400mi").
>>> - The client cache cleaning threshold is not aggressive enough to keep
>>>   the free cache size in the desired range. I need to make it more
>>>   aggressive, but this should not reduce speed or increase latency.
>>>
>>> mds_cache_memory_limit=4gi to 16gi
>>> client_oc_max_objects=1000 to 10000
>>> client_oc_size=200mi to 400mi
>>> client_permissions=false #to reduce latency.
>>> client_cache_size=16 to 128
>>>
>>> What do you think?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
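The option list Özkan proposes could be applied through the cluster's central config database as sketched below. The values are the upper bounds from his list, expressed in bytes, and are a starting point to test rather than a recommendation; note also that `client_cache_size` and the `client_oc_*` object-cacher options only affect ceph-fuse/libcephfs clients, while the kernel CephFS client (which the debugfs paths above show bmw-m4 is using) ignores them.

```shell
# MDS cache: default 4 GiB -> 16 GiB (option takes bytes; 16 * 1024^3):
ceph config set mds mds_cache_memory_limit 17179869184

# Client-side settings (ceph-fuse/libcephfs only; kernel mounts ignore these):
ceph config set client client_cache_size 128
ceph config set client client_oc_max_objects 10000
ceph config set client client_oc_size 419430400   # 400 MiB
ceph config set client client_permissions false

# Verify what the daemons will actually pick up:
ceph config get mds mds_cache_memory_limit
ceph config dump | grep -e client_oc -e client_cache
```

Raising `mds_cache_memory_limit` also raises the point at which the MDS starts asking clients to release caps, so it can mask rather than fix a "failing to respond to cache pressure" warning; the recall tunables (`mds_recall_max_caps` and friends) are the other side of that trade-off.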