Re: Advice on OSD upgrades

Hi,

that's how I did it for my OSDs 25 to 30 (you can include as many OSD
numbers as you like, as long as you have enough free space).

First you can reweight the OSDs to 0 to move their copies to other
OSDs:

for i in {25..30};
do
  # set the CRUSH weight to 0 so the data is migrated off the OSD
  ceph osd crush reweight osd.$i 0
done

and wait until the rebalancing is done (i.e. until the cluster health is OK again).
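
If you want to script the wait, a rough sketch could look like this
(the polling interval is arbitrary):

# poll the cluster health until it reports HEALTH_OK again
until ceph health | grep -q HEALTH_OK; do
  sleep 60
done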

Then you can remove the OSDs from the cluster:

for i in {25..30};
do
  # mark out, stop the daemon, then remove the OSD from CRUSH, auth and the OSD map
  ceph osd out osd.$i && stop ceph-osd id=$i && ceph osd crush remove osd.$i && \
  ceph auth del osd.$i && ceph osd rm osd.$i
done
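
Afterwards you can check with

ceph osd tree

that the removed OSDs no longer show up.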

Then you can remove the disks from the system:

echo 1 > /sys/block/sd<disk>/device/delete

where sd<disk> is the SCSI device name backing the OSD (you can find it
in /proc/partitions).
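
If the OSD data directories are still mounted under the default path
/var/lib/ceph/osd/ceph-$i, a sketch like the following could map each
removed OSD to its disk and delete the device (this assumes plain sdX
devices and a GNU df that supports --output):

for i in {25..30};
do
  # data partition backing the OSD, e.g. /dev/sdf1
  dev=$(df --output=source /var/lib/ceph/osd/ceph-$i | tail -1)
  # strip the partition number to get the whole disk, e.g. sdf
  disk=$(basename "${dev%%[0-9]*}")
  umount /var/lib/ceph/osd/ceph-$i
  echo 1 > /sys/block/$disk/device/delete
done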

Then you can remove the disks physically (if hotplug is available).

After inserting the new disks, create the new OSDs with ceph-deploy.
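
For example (hostname and device are placeholders, adjust to your
environment):

ceph-deploy disk zap node1:sdf
ceph-deploy osd create node1:sdf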

Regards

Steffen


>>> Stephen Mercier <stephen.mercier@xxxxxxxxxxxx> wrote on
Thursday, 14 April 2016 at 15:29:
> Good morning,
> 
> We've been running a medium-sized (88 OSDs - all SSD) ceph cluster for
> the past 20 months. We're very happy with our experience with the
> platform so far.
> 
> Shortly, we will be embarking on an initiative to replace all 88 OSDs
> with new drives (planned maintenance and lifecycle replacement). Before
> we do so, however, I wanted to confirm with the community as to the
> proper order of operation to perform such a task.
> 
> The OSDs are divided evenly across an even number of hosts, which are
> then divided evenly between 2 cabinets in 2 physically separate
> locations. The plan is to replace the OSDs, one host at a time, cycling
> back and forth between cabinets, replacing one host per week, or every
> 2 weeks (depending on the amount of time the CRUSH rebalancing takes).
> 
> For each host, the plan was to mark the OSDs as out, one at a time,
> closely monitoring each of them, moving to the next OSD once the
> current one is balanced out. Once all OSDs are successfully marked as
> out, we will then delete those OSDs from the cluster, shut down the
> server, replace the physical drives, and once rebooted, add the new
> drives to the cluster as new OSDs using the same method we've used
> previously, doing so one at a time to allow for rebalancing as they
> rejoin the cluster.
> 
> My questions are: Does this process sound correct? Should I also mark
> the OSDs as down when I mark them as out? Are there any steps I'm
> overlooking in this process?
> 
> Any advice is greatly appreciated.
> 
> Cheers,
> -
> Stephen Mercier | Sr. Systems Architect
> Attainia Capital Planning Solutions (ACPS)
> O: (650)241-0567, 727 | TF: (866)288-2464, 727
> stephen.mercier@xxxxxxxxxxxx | www.attainia.com

-- 
Klinik-Service Neubrandenburg GmbH
Allendestr. 30, 17036 Neubrandenburg
Amtsgericht Neubrandenburg, HRB 2457
Geschaeftsfuehrerin: Gudrun Kappich
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


