Yes, you can override the capacity using "ceph config set osd.N
osd_mclock_max_capacity_iops_ssd <new_value>".
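For example, something along these lines should work. Note that osd.1 to
osd.4 and the value 25000 below are only placeholders (the four OSDs sharing
one NVMe, and your single-OSD bench result divided by 4); adjust them to your
environment:

    # check what the running OSD currently uses
    ceph config show osd.1 osd_mclock_max_capacity_iops_ssd

    # override the capacity for each OSD that shares the NVMe
    ceph config set osd.1 osd_mclock_max_capacity_iops_ssd 25000
    ceph config set osd.2 osd_mclock_max_capacity_iops_ssd 25000
    ceph config set osd.3 osd_mclock_max_capacity_iops_ssd 25000
    ceph config set osd.4 osd_mclock_max_capacity_iops_ssd 25000

As for throttling recovery against client I/O during the drain, one thing you
could try with the mClock scheduler (a suggestion, not a guaranteed fix for
the slow ops) is to keep the OSDs on the high_client_ops profile, if they are
not on it already, so client operations are favoured over recovery/backfill:

    # applies to all OSDs; use osd.N instead of "osd" to target only the
    # new NVMe OSDs
    ceph config set osd osd_mclock_profile high_client_ops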
On Tue, Oct 1, 2024 at 3:45 PM Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
wrote:

> Dug a bit further. It seems the osd_mclock_max_capacity_iops_ssd value in
> the config db, which comes from the ceph bench, is determined for a single
> OSD. However, I have 4 OSDs on my 15 TB NVMe, and when I run the bench in
> parallel on the 4 OSDs of one NVMe drive, the result is a quarter of that.
>
> Is it safe to divide this value by 4 in the config db?
>
> ________________________________
> From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
> Sent: Tuesday, October 1, 2024 1:47 PM
> To: Ceph Users <ceph-users@xxxxxxx>
> Subject: Is there a way to throttle faster osds due to slow ops?
>
> Hi,
>
> We have extended our clusters with some new nodes, and currently it is
> impossible to remove the NVMe drive that holds the index pool from any old
> node without generating slow ops and degrading cluster performance.
>
> The way I currently want to remove it in this Quincy, non-cephadm cluster
> is to crush reweight the OSD to 0 and then remove it. This data movement
> causes slow ops the whole time the NVMe OSD is being drained.
>
> In my opinion the faster new drives may be pushing harder on the old
> servers' NVMes, which causes high iowait on the old NVMes, so I would like
> to somehow throttle the new NVMes. Is that possible with mClock or in any
> other way? (max backfills, osd recovery ops and recovery op priority are
> already 1, and the balancer max misplaced ratio is 0.01.)
>
> This is what one of the slow OSDs reports during the removal:
> https://gist.github.com/Badb0yBadb0y/15b51e524a47dfbd2728bbabc18238fc#file-gistfile1-txt
>
> 2024-10-01T11:46:29.601+0700 7f29bf4f8640 0 bluestore(/var/lib/ceph/osd/ceph-91) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.583707809s, txc = 0x55af2bd2e300
> 2024-10-01T11:46:29.601+0700 7f29bf4f8640 0 bluestore(/var/lib/ceph/osd/ceph-91) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.541916847s, txc = 0x55af1a035b00
> 2024-10-01T11:46:29.601+0700 7f29bf4f8640 0 bluestore(/var/lib/ceph/osd/ceph-91) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.533919334s, txc = 0x55af19fafb00
> 2024-10-01T11:46:29.601+0700 7f29bf4f8640 0 bluestore(/var/lib/ceph/osd/ceph-91) log_latency_fn slow operation observed for _txc_committed_kv, latency = 6.904534340s, txc = 0x55af49814c00
> 2024-10-01T11:46:29.601+0700 7f29bf4f8640 0 bluestore(/var/lib/ceph/osd/ceph-91) log_latency_fn slow operation observed for _txc_committed_kv, latency = 6.911001205s, txc = 0x55af24b19800
> 2024-10-01T11:46:29.601+0700 7f29bf4f8640 0 bluestore(/var/lib/ceph/osd/ceph-91) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.597061634s, txc = 0x55af4fe0fb00
> 2024-10-01T11:46:30.889+0700 7f29becf7640 4 rocksdb: [db/db_impl/db_impl_write.cc:1736] [default] New memtable created with log file: #280327. Immutable memtables: 0.
> 2024-10-01T11:46:30.889+0700 7f29becf7640 4 rocksdb: [db/column_family.cc:983] [default] Increasing compaction threads because we have 18 level-0 files
> 2024-10-01T11:46:30.889+0700 7f29c4512640 4 rocksdb: (Original Log Time 2024/10/01-11:46:30.893378) [db/db_impl/db_impl_compaction_flush.cc:2394] Calling FlushMemTableToOutputFile with column family [default], flush slots available 1, compaction slots available 2, flush slots scheduled 1, compaction slots scheduled 2
> 2024-10-01T11:46:30.889+0700 7f29c4512640 4 rocksdb: [db/flush_job.cc:335] [default] [JOB 5604] Flushing memtable with next log file: 280327
> 2024-10-01T11:46:30.889+0700 7f29c4512640 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1727757990893428, "job": 5604, "event": "flush_started", "num_memtables": 1, "num_entries": 2437269, "num_deletes": 2384624, "total_data_size": 233695787, "memory_usage": 278437952, "flush_reason": "Write Buffer Full"}
> 2024-10-01T11:46:30.889+0700 7f29c4512640 4 rocksdb: [db/flush_job.cc:364] [default] [JOB 5604] Level-0 flush table #280328: started
>
> Thank you
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Sridhar Seshasayee
Partner Engineer
Red Hat <https://www.redhat.com>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx