How to Speed Up Draining OSDs?

"Alex Hussein-Kershaw (HE/HIM)" <alexhus@xxxxxxxxxxxxx> · Mon, 21 Oct 2024 13:08:36 +0000

Hi Folks,

I'm trying to scale in a Ceph cluster. It's running 19.2.0 and is cephadm-managed. It's just a test system, so it has basically no data and only 3 OSDs.

As part of scaling in, I run "ceph orch host drain <hostname> --zap-osd-devices", as per the Ceph Host Management documentation <https://docs.ceph.com/en/reef/cephadm/host-management/#removing-hosts>. That starts the OSD draining.
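
A minimal sketch of that step, with two follow-up checks that can show whether the drain is actually progressing. The hostname is a hypothetical placeholder, and the CEPH variable is an illustration-only switch: by default the script only prints the commands, and setting CEPH=ceph runs them against the cluster.

```shell
#!/usr/bin/env bash
# Dry run by default: prints each command. Export CEPH=ceph to execute for real.
CEPH="${CEPH:-echo ceph}"
# Hypothetical placeholder; substitute the host being removed.
TARGET_HOST="${TARGET_HOST:-host-to-remove}"

drain_host() {
  # Start draining all daemons from the host and zap its OSD devices afterwards.
  $CEPH orch host drain "$TARGET_HOST" --zap-osd-devices
  # Show the state of the queued OSD removals.
  $CEPH orch osd rm status
  # List the daemons still placed on the host; empty once the drain is done.
  $CEPH orch ps "$TARGET_HOST"
}

drain_host
```

Re-running the last two commands periodically is one way to tell a slow drain from one that is not moving at all.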

However, that drain seems to take an enormous amount of time. My OSD has less than 100 MiB raw storage, and I let it run for 2 hours over lunch and it still had not finished, so I cancelled it.

I'm not sure how this scales, but I'm assuming at least linearly with data stored, which seems like bad news for doing this on real systems, which may have several TBs per OSD.

I had a look at the recovery profiles in the mClock Config Reference <https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/>, which seemed to indicate I could speed this up. My impression, though, was that I might get a speed-up of about 2x, which would still take an age.
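
For concreteness, switching the recovery profile as that page describes could look roughly like the sketch below. "high_recovery_ops" is one of the built-in mClock profiles; how much it helps depends on the cluster, so treat this as an illustration and not a tuning recommendation. As an assumption for safety, the CEPH variable makes this a dry run that only prints the commands unless CEPH=ceph is set.

```shell
#!/usr/bin/env bash
# Dry run by default: prints each command. Export CEPH=ceph to execute for real.
CEPH="${CEPH:-echo ceph}"

boost_recovery() {
  # Favour recovery/backfill over client I/O on all OSDs.
  $CEPH config set osd osd_mclock_profile high_recovery_ops
  # Read back the value stored in the config database.
  $CEPH config get osd osd_mclock_profile
}

restore_profile() {
  # Drop the override so OSDs fall back to the release default profile.
  $CEPH config rm osd osd_mclock_profile
}

boost_recovery
```

Calling restore_profile once the drain has finished returns the OSDs to their default scheduling behaviour.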

On the other hand, just switching off the host running the OSD and doing an offline host removal ("ceph orch host rm <hostname> --offline") seems much easier, with the trade-off that the cluster recovers after the loss of the OSD rather than pre-emptively. But the big risk of that seems to be mitigated by running "ceph orch host ok-to-stop <hostname>" beforehand, to check that I won't cause any PGs to go offline.
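
The offline route, using the two commands above, can be sketched as two separate steps with the power-off in between. The hostname is a hypothetical placeholder, and the CEPH variable is an illustration-only dry-run switch (commands are printed unless CEPH=ceph is set). Depending on the release, the removal command may also need --force; it is omitted here to match the command as quoted.

```shell
#!/usr/bin/env bash
# Dry run by default: prints each command. Export CEPH=ceph to execute for real.
CEPH="${CEPH:-echo ceph}"
# Hypothetical placeholder; substitute the host being removed.
TARGET_HOST="${TARGET_HOST:-host-to-remove}"

precheck_host() {
  # Ask the orchestrator whether stopping this host would take any PGs offline.
  $CEPH orch host ok-to-stop "$TARGET_HOST"
}

remove_offline_host() {
  # Only run this after the host has been powered off out of band.
  $CEPH orch host rm "$TARGET_HOST" --offline
}

# Step 1: check. Step 2 (manual): power off the host. Step 3: remove it.
precheck_host && echo "power off $TARGET_HOST, then call remove_offline_host"
```

Keeping the removal in its own function means it cannot run by accident straight after the check, while the host is still up.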

Are there any tricks here that I'm missing?

Thanks,
Alex

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx