We are contemplating an upgrade from 4TB HDDs to 20TB HDDs (cluster info below, size 3), and as part of that discussion we are trying to see if there is a more efficient way to do so. Our current process for failed drives is:

1. Pull the failed drive (after troubleshooting, of course).
2. In the cephadm GUI, find the OSD and purge it.
3. Wait for the rebalance.
4. Insert the new drive and let the cluster rebalance after it automatically adds the drive as an OSD (yes, we have auto-add enabled in the clusters).

I imagine that with an existing good drive we would use delete instead of purge, but the process would be similar, except the drive swap would happen after the data was moved. Would the replace flag (or the "keep OSD" option in the GUI) allow us to avoid the initial rebalance by backfilling the new drive with the old drive's content? It would be nice if we could just copy the content to the new drive and go from there. We would like to avoid a lot of read/write cluster recovery activity if possible, since we could be replacing 40+ drives in each cluster.

We are open to trying a few methods, of course, but we thought the group here might have some insight into the most optimal way of doing these swaps. For history: last time we did this, we drained an entire host, upgraded the drives, and then added the host back. This caused a temporary imbalance that we would like to avoid this time around.

Regards,
-Brent

Existing Clusters:
Test: Reef 18.2.0 (all virtual on NVMe)
US Production (HDD): Reef 18.2.4 cephadm with 11 OSD servers, 5 mons, 4 rgw, 2 iscsigw, 2 mds
UK Production (HDD): Reef 18.2.4 cephadm with 20 OSD servers, 5 mons, 4 rgw, 2 iscsigw, 2 mds
US Production (SSD): Reef 18.2.4 cephadm with 6 OSD servers, 5 mons, 4 rgw, 2 mds
UK Production (SSD): Reef 18.2.4 cephadm with 6 OSD servers, 5 mons, 4 rgw, 2 mds

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
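P.S. For reference, the replace-style workflow I am asking about could presumably be done from the CLI along these lines. This is only a sketch of how I understand it, not a tested procedure; the OSD id (12) and device name are placeholders, and I am assuming `--replace` behaves as documented (the OSD is marked "destroyed" and keeps its CRUSH position and id, so PGs are not remapped away, and the new drive backfills from replicas under the same id rather than triggering a full rebalance):

```shell
# Remove OSD 12 for replacement: --replace marks it "destroyed" instead of
# purging it from the CRUSH map, so the cluster does not remap its PGs to
# other OSDs while the drive is out; --zap wipes the old device on removal.
ceph orch osd rm 12 --replace --zap

# Optionally pause data movement while swapping many drives in one pass:
ceph osd set norebalance
ceph osd set norecover

# After inserting the new drive, the orchestrator (with auto-add enabled,
# as in our clusters) should recreate OSD 12 on it; then resume movement:
ceph osd unset norebalance
ceph osd unset norecover

# Watch backfill progress onto the replacement drive:
ceph -s
```

If that understanding is right, the new drive is still filled by backfill from the surviving replicas (not by copying the old drive's content directly), but the double data movement of purge-then-re-add is avoided.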