On 25/02/2021 11:19, Dylan McCulloch wrote:
> Simon Oosthoek wrote:
>> On 24/02/2021 22:28, Patrick Donnelly wrote:
>>> Hello Simon,
>>>
>>> On Wed, Feb 24, 2021 at 7:43 AM Simon Oosthoek <s.oosthoek(a)science.ru.nl> wrote:
>>>
>>> On 24/02/2021 12:40, Simon Oosthoek wrote:
>>> Hi
>>>
>>> we've been running our Ceph cluster for nearly 2 years now (Nautilus)
>>> and recently, due to a temporary situation, the cluster is at 80% full.
>>>
>>> We are only using CephFS on the cluster.
>>>
>>> Normally, I realize we should be adding OSD nodes, but this is a
>>> temporary situation, and I expect the cluster to go to <60% full quite soon.
>>>
>>> Anyway, we are noticing some really problematic slowdowns. There are
>>> some things that could be related, but we are unsure...
>>>
>>> - Our 2 MDS nodes (1 active, 1 standby) are configured with 128GB RAM,
>>> but are not using more than 2GB; this looks either very inefficient or
>>> wrong ;-)
>>>
>>> After looking at our monitoring history, it seems the mds cache is
>>> actually used more fully, but most of our servers are getting a weekly
>>> reboot by default. This clears the mds cache, obviously. I wonder if
>>> that's a smart idea for an MDS node...? ;-)
>>>
>>> No, it's not. Can you also check that you do not have mds_cache_size
>>> configured, perhaps in the MDS's local ceph.conf?
>>>
>> Hi Patrick,
>>
>> I've already changed the reboot period to 1 month.
>>
>> The mds_cache_size is not configured locally in the /etc/ceph/ceph.conf
>> file, so I guess it was just the weekly reboot that cleared the cached
>> data from memory...
>>
>> I'm starting to think that the nearly full cluster is probably the only
>> cause of the performance problems, though I don't know why that would be.
>
> Did the performance issue only arise when OSDs in the cluster reached
> 80% usage? What is your osd nearfull_ratio?
>

$ ceph osd dump | grep ratio
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85

> Is the cluster in HEALTH_WARN with nearfull OSDs?

]# ceph -s
  cluster:
    id:     b489547c-ba50-4745-a914-23eb78e0e5dc
    health: HEALTH_WARN
            2 pgs not deep-scrubbed in time
            957 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum cephmon3,cephmon1,cephmon2 (age 7d)
    mgr: cephmon3(active, since 2M), standbys: cephmon1, cephmon2
    mds: cephfs:1 {0=cephmds2=up:active} 1 up:standby
    osd: 168 osds: 168 up (since 11w), 168 in (since 9M); 43 remapped pgs

  task status:
    scrub status:
        mds.cephmds2: idle

  data:
    pools:   10 pools, 5280 pgs
    objects: 587.71M objects, 804 TiB
    usage:   1.4 PiB used, 396 TiB / 1.8 PiB avail
    pgs:     9634168/5101965463 objects misplaced (0.189%)
             5232 active+clean
             29   active+remapped+backfill_wait
             14   active+remapped+backfilling
             5    active+clean+scrubbing+deep+repair

  io:
    client:   136 MiB/s rd, 600 MiB/s wr, 386 op/s rd, 359 op/s wr
    recovery: 328 MiB/s, 169 objects/s

> We noticed recently, when one of our clusters had nearfull OSDs, that
> cephfs client performance was heavily impacted.
> Our cluster is nautilus 14.2.15. Clients are kernel 4.19.154.
> We determined that it was most likely due to the ceph client forcing
> sync file writes when the nearfull flag is present.
> https://github.com/ceph/ceph-client/commit/7614209736fbc4927584d4387faade4f31444fce
> Increasing and decreasing the nearfull ratio confirmed that performance
> was only impacted while the nearfull flag was present.
> Not sure if that's relevant for your case.

I think this could be very similar in our cluster, thanks for sharing
your insights!
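
For the archives, this is what I intend to check on our side, based on your
observation (untested on our cluster yet, so take it as a sketch rather than
a recipe): first, whether the nearfull flag is actually set on the osdmap,
and then whether temporarily raising the nearfull ratio makes the client
slowdown go away. Raising the ratio eats into the safety margin before
backfillfull/full, so I'd only do it briefly while we clean up data:

$ ceph osd dump | grep flags            # "nearfull" should appear here if the flag is set
$ ceph health detail | grep -i nearfull # and any nearfull OSDs show up here

$ ceph osd set-nearfull-ratio 0.88      # temporarily raise it (example value) and watch client IO
$ ceph osd set-nearfull-ratio 0.85      # restore the previous value once the test is done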
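
Regarding the MDS cache: now that the weekly reboots are gone, I also want to
confirm the cache is actually allowed to grow. A quick sketch of what I plan
to run on the active MDS host (assuming nothing overrides the defaults; if
mds_cache_memory_limit is still at its Nautilus default of 1 GiB, that alone
could explain why the daemon never uses much more than ~2 GB of RAM):

$ ceph daemon mds.cephmds2 config get mds_cache_memory_limit
$ ceph daemon mds.cephmds2 cache status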
Cheers

/Simon