Re: Best practice with 0.48.2 to take a node into maintenance

Josh Durgin <josh.durgin@xxxxxxxxxxx> · Mon, 03 Dec 2012 11:14:55 -0800

On 12/03/2012 11:05 AM, Oliver Francke wrote:
> Hi *,
>
> Well, even if 0.48.2 is really stable and reliable, that is not always the case with the Linux kernel. We have a couple of nodes where an update would make life better.
> Since our OSD nodes also host VMs, draining them is not a problem: we just migrate all the VMs to other nodes.
> Then just reboot? Perhaps not, because all the OSDs will begin to remap/backfill, as they are instructed to do. Well, declare them as "osd lost"?
> Dangerous. Is there another way of doing node maintenance that I am missing? Will we have to wait for bobtail for far less hassle with all the remapping and resources?

By default the monitors won't mark an OSD out in the time it takes to
reboot, but if maintenance takes longer, you can drain data from the
node.

A simple way to rate limit it yourself is by slowly lowering the
weights of the OSDs on the host you want to update, e.g. by 0.1 at a
time and waiting for recovery to complete before lowering again. Once
they're at 0 and the cluster is healthy, they're not responsible for
any data anymore, and the node can be rebooted.
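
The stepwise lowering described above could be scripted roughly as
follows. This is only a sketch under assumptions, not a tested
procedure: it assumes the weights in question are the ones set with
"ceph osd reweight" (range 0 to 1), that every OSD on the host starts
at 1.0, and that "ceph health" printing HEALTH_OK is a good enough
signal that recovery has finished. The helper names (weight_steps,
cluster_healthy, drain_host) and the polling interval are made up for
illustration.

```python
import subprocess
import time


def weight_steps(start=1.0, step=0.1):
    """Return the decreasing weights to apply, always ending at 0.0."""
    if step < 0.01:
        raise ValueError("step must be at least 0.01")
    steps = []
    weight = round(start - step, 2)
    while weight > 0:
        steps.append(weight)
        weight = round(weight - step, 2)
    steps.append(0.0)
    return steps


def cluster_healthy(run=subprocess.check_output):
    """True if 'ceph health' reports HEALTH_OK (assumed recovery signal)."""
    return run(["ceph", "health"]).decode().startswith("HEALTH_OK")


def drain_host(osd_ids, step=0.1, poll_seconds=30,
               run=subprocess.check_output, sleep=time.sleep):
    """Lower the weight of each OSD one step at a time, waiting in between."""
    for weight in weight_steps(1.0, step):
        for osd_id in osd_ids:
            run(["ceph", "osd", "reweight", str(osd_id), str(weight)])
        # Give the cluster a moment to notice the change, then wait
        # until it reports healthy before lowering the weights again.
        sleep(poll_seconds)
        while not cluster_healthy(run):
            sleep(poll_seconds)


# Example (hypothetical OSD ids on the host to be rebooted):
# drain_host([3, 4, 5])
```

The run and sleep parameters are only there so the loop can be tried
out with stand-ins before pointing it at a real cluster. After the
reboot, the weights would need to be raised again, ideally in the same
gradual way.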

Josh