Thank you for the advice.
Our CRUSH map is actually set up with replication set to 3, and at least one copy in each cabinet, ensuring no one host is a single point of failure. We fully intended to perform this maintenance over the course of many weeks, one host at a time. We felt that the staggered deploy times for the SSDs, given their distinctive failure pattern, were a benefit anyway (i.e. when one goes, all of its friends are usually close behind).
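For reference, a CRUSH rule along these lines is what enforces that placement. This is a hypothetical sketch only (the rule name is made up, and a bucket type named "cabinet" is assumed to be defined in the map); with pool size 3 it yields two copies in one cabinet and one in the other:

```
rule replicated_cabinets {
    ruleset 1
    type replicated
    min_size 3
    max_size 3
    step take default
    # Pick both cabinets, then up to 2 hosts in each; size 3 takes the
    # first 3 results: 2 OSDs in one cabinet, 1 in the other.
    step choose firstn 2 type cabinet
    step chooseleaf firstn 2 type host
    step emit
}
```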
Cheers,
-
Stephen Mercier | Sr. Systems Architect
Attainia Capital Planning Solutions (ACPS)
O: (650)241-0567, 727 | TF: (866)288-2464, 727
On Apr 14, 2016, at 7:00 AM, Wido den Hollander wrote:

On 14 April 2016 at 15:29, Stephen Mercier <stephen.mercier@xxxxxxxxxxxx> wrote:
Good morning,
We've been running a medium-sized (88 OSDs - all SSD) ceph cluster for the
past 20 months. We're very happy with our experience with the platform so far.
Shortly, we will be embarking on an initiative to replace all 88 OSDs with new
drives (Planned maintenance and lifecycle replacement). Before we do so,
however, I wanted to confirm with the community as to the proper order of
operation to perform such a task.
The OSDs are divided evenly across an even number of hosts which are then
divided evenly between 2 cabinets in 2 physically separate locations. The plan
is to replace the OSDs, one host at a time, cycling back and forth between
cabinets, replacing one host per week, or every 2 weeks (Depending on the
amount of time the crush rebalancing takes).
I assume that your replication is set to "2" and that you replicate over the two locations? In that case, only work on the OSDs in the first location, and start on the second one after you have replaced them all.

For each host, the plan was to mark the OSDs as out, one at a time, closely
monitoring each of them, moving to the next OSD once the current one is
balanced out. Once all OSDs are successfully marked as out, we will then
delete those OSDs from the cluster, shutdown the server, replace the physical
drives, and once rebooted, add the new drives to the cluster as new OSDs using
the same method we've used previously, doing so one at a time to allow for
rebalancing as they rejoin the cluster.
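Sketched as a dry run, the per-host sequence above might look like this (the OSD IDs and the `plan_host` helper are hypothetical; the script only prints the commands for review rather than executing them):

```shell
# plan_host: print the ceph commands for draining and removing one host's
# OSDs, without executing anything (a dry run to review first).
plan_host() {
    # Phase 1: mark each OSD out, one at a time, waiting for rebalance.
    for id in "$@"; do
        echo "ceph osd out $id"
        echo "ceph health            # repeat until HEALTH_OK before the next OSD"
    done
    # Phase 2: once all are out and the cluster is healthy, remove them
    # from CRUSH, auth, and the OSD map before swapping the drives.
    for id in "$@"; do
        echo "stop ceph-osd id=$id   # or: systemctl stop ceph-osd@$id"
        echo "ceph osd crush remove osd.$id"
        echo "ceph auth del osd.$id"
        echo "ceph osd rm $id"
    done
}

# Hypothetical OSD IDs for the first host -- substitute your own.
PLAN=$(plan_host 10 11 12)
echo "$PLAN"
```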
My questions are… Does this process sound correct? Should I also mark the OSDs
as down when I mark them as out? Are there any steps I'm overlooking in this
process?
No, marking them out is just fine. That tells CRUSH the OSD is no longer participating in data placement. Its effective weight will be 0 and that's it. Like others have mentioned, reweight the OSD to 0 at the same time you mark it as out. That way you prevent a double rebalance. Keep it marked as UP so that it can help in migrating the PGs to other nodes.

Any advice is greatly appreciated.
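That reweight-then-out ordering can be sketched as a dry run (the OSD ID and the `drain_osd` helper are hypothetical; nothing is executed, the commands are only printed):

```shell
# drain_osd: print the single-rebalance drain for one OSD. Setting the
# CRUSH weight to 0 first migrates the PGs away while the OSD stays UP
# and serving; marking it out afterwards then causes no second rebalance.
drain_osd() {
    id=$1
    echo "ceph osd crush reweight osd.$id 0  # PGs migrate off; OSD stays UP"
    echo "ceph osd out $id                   # weight already 0, no extra movement"
}

DRAIN=$(drain_osd 10)
echo "$DRAIN"
```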
Cheers,
-
Stephen Mercier | Sr. Systems Architect
Attainia Capital Planning Solutions (ACPS)
O: (650)241-0567, 727 | TF: (866)288-2464, 727
stephen.mercier@xxxxxxxxxxxx | www.attainia.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com