Re: Best practice with 0.48.2 to take a node into maintenance

Hi Florian,

On 03.12.2012 at 20:45, Smart Weblications GmbH - Florian Wiessner <f.wiessner@xxxxxxxxxxxxxxxxxxxxx> wrote:

> On 03.12.2012 at 20:21, Oliver Francke wrote:
>> Hi Josh,
>> 
>> On 03.12.2012 at 20:14, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:
>> 
>>> On 12/03/2012 11:05 AM, Oliver Francke wrote:
>>>> Hi *,
>>>> 
>>>> well, even if 0.48.2 is really stable and reliable, the same is not always true of the Linux kernel. We have a couple of nodes where a kernel update would make life better.
>>>> Since our OSD nodes also host VMs, it is no problem to drain them by migrating all the VMs to other nodes.
>>>> But then just reboot? Perhaps not, because all the OSDs will begin to remap/backfill; that is what they are instructed to do. Declare them as "osd lost"?
>>>> Dangerous. Is there another way of doing node maintenance that I am missing? Will we have to wait for bobtail for far less hassle with all the remapping and resource usage?
>>> 
>>> By default the monitors won't mark an OSD out in the time it takes to
>>> reboot, but if maintenance takes longer, you can drain data from the
>>> node.
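The grace period mentioned here is controlled by the monitor option `mon osd down out interval` (300 seconds by default); if maintenance reboots routinely run longer than that, the interval can be raised in ceph.conf. A sketch, with an illustrative value:

```ini
[mon]
    ; how long an OSD may be "down" before the monitors mark it
    ; "out" and trigger backfill (default: 300 seconds)
    mon osd down out interval = 900
```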
>>> 
>>> A simple way to rate limit it yourself is by slowly lowering the
>>> weights of the OSDs on the host you want to update, e.g. by 0.1 at a
>>> time and waiting for recovery to complete before lowering again. Once
>>> they're at 0 and the cluster is healthy, they're not responsible for
>>> any data anymore, and the node can be rebooted.
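The stepwise drain described above can be sketched as a small dry-run script. The OSD ids (0-2) are illustrative, and the commands are only printed here; in real use you would run each `ceph osd reweight` and wait for the cluster to return to HEALTH_OK before taking the next step:

```shell
#!/bin/sh
# Dry-run sketch: print the commands for gradually draining OSDs 0-2,
# lowering each one's reweight by 0.1 at a time. Between real steps,
# check "ceph health" and wait for recovery to finish.
drain_plan() {
  for osd in 0 1 2; do
    w=10                       # current weight in tenths (starts at 1.0)
    while [ "$w" -gt 0 ]; do
      w=$((w - 1))
      # reweight takes a value between 0 and 1
      echo "ceph osd reweight $osd 0.$w"
    done
  done
}
drain_plan
```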
>>> 
>> 
>> True. I should have mentioned that I know about that smooth way, but for a planned reboot it takes far too much time ;)
>> Still, if it's recommended, it's recommended ;)
>> 
> 
> 
> I did rolling reboots of our whole cluster a few days ago (kernel 3.4.20). When a
> system reboots without running fsck, ceph doesn't start to backfill in my setup.
> 
> Some nodes did run fsck after the upgrade, so ceph marked their OSDs as down and
> started to backfill, but once the missing OSDs were back up and running, the
> backfill stopped, ceph just did a little peering, and the cluster was healthy
> again within a few minutes (2-5 minutes)…
> 

If you add up all the BIOS, POST and RAID-controller checks, the Linux boot, the openvswitch STP setup and so on, you can imagine that a reboot takes a "couple of minutes", while with our setup the cluster normally detects the outage after about 30 seconds and starts doing its work.
Everything is fine as far as it goes, but perhaps we could spare the cluster the big load of remapping and re-remapping (see the "slow requests" theme), so in terms of QoS I have to ask for a "better way" ;)
All that had a big customer impact in the past… Time to ask.

Kind reg's

Oliver.

 

> 
> 
> 
> -- 
> 
> With kind regards,
> 
> Florian Wiessner
> 
> Smart Weblications GmbH
> Martinsberger Str. 1
> D-95119 Naila
> 
> fon.: +49 9282 9638 200
> fax.: +49 9282 9638 205
> 24/7: +49 900 144 000 00 - 0,99 EUR/Min*
> http://www.smart-weblications.de
> 
> --
> Registered office: Naila
> Managing director: Florian Wiessner
> Commercial register no.: HRB 3840, Hof district court
> *from a German landline; mobile rates may differ
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


