On Tue, Jul 22, 2014 at 6:28 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
> On 07/22/2014 03:48 PM, Andrey Korolyov wrote:
>> On Tue, Jul 22, 2014 at 5:19 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>> Hi,
>>>
>>> Currently on Ubuntu with Upstart, when you invoke a restart like this:
>>>
>>> $ sudo restart ceph-osd-all
>>>
>>> it will restart all OSDs at once, which can increase the load on the
>>> system quite a bit.
>>>
>>> It's better to restart the OSDs one by one:
>>>
>>> $ sudo restart ceph-osd id=X
>>>
>>> But then you have to figure out all the IDs by doing a find in
>>> /var/lib/ceph/osd, which is more manual work.
>>>
>>> I'm thinking of patching the init scripts to allow something like this:
>>>
>>> $ sudo restart ceph-osd-all delay=180
>>>
>>> It would then wait 180 seconds between each OSD restart, making the
>>> process even smoother.
>>>
>>> I know there are currently sysvinit, upstart and systemd scripts, so it
>>> has to be implemented in several places, but how does the general idea
>>> sound?
>>>
>>> --
>>> Wido den Hollander
>>> Ceph consultant and trainer
>>> 42on B.V.
>>>
>>> Phone: +31 (0)20 700 9902
>>> Skype: contact42on
>>
>> Hi,
>>
>> this behaviour obviously has the negative side of increased overall
>> peering time and a larger integral value of out-of-SLA delays. I'd vote
>> for warming up the necessary files, most likely the collections, just
>> before restart. If there is not enough room to hold all of them at once,
>> we can probably combine both methods to achieve a lower impact on
>> restart, although adding a simple delay sounds much more straightforward
>> than pulling the file cache into RAM.
>
> In the case I'm talking about there are 23 OSDs running on a single
> machine, and restarting all the OSDs causes a lot of peering and reading
> of PG logs.
>
> A warm-up mechanism might work, but that would be a lot of work.
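The staggered restart Wido proposes could be sketched roughly like this until the init scripts grow a delay= option. The OSD IDs are derived from the stock /var/lib/ceph/osd/ceph-<id> layout; the loop is a dry run by default (it echoes the commands), and the Upstart job name is assumed to be the standard ceph-osd:

```shell
#!/bin/sh
# Restart all local OSDs one by one, pausing between each so peering
# load stays smooth. Dry-run by default: set RESTART= (empty) to
# actually execute the restart commands.
OSD_ROOT=${OSD_ROOT:-/var/lib/ceph/osd}
DELAY=${DELAY:-180}
RESTART=${RESTART:-echo}

list_osd_ids() {
    # Derive the numeric IDs from the directory names, e.g. ceph-3 -> 3
    for dir in "$OSD_ROOT"/ceph-*; do
        [ -d "$dir" ] || continue
        echo "${dir##*/ceph-}"
    done
}

for id in $(list_osd_ids); do
    # Upstart instance job; sysvinit would use 'service ceph restart osd.N'
    $RESTART restart ceph-osd id="$id"
    sleep "$DELAY"
done
```

Combined with dsh as in the mail below, the same delay logic would then apply per host.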
> When upgrading your cluster you simply want to do this:
>
> $ dsh -g ceph-osd "sudo restart ceph-osd-all delay=180"
>
> That might take hours to complete, but if it's just an upgrade that
> doesn't matter. You want as minimal an impact on service as possible.

I'd suggest measuring the impact with vmtouch [0]; it decreased OSD
startup time greatly in my tests, but I was stuck with the same resource
exhaustion as before (primarily an IOPS ceiling) after the OSD marked
itself up.

0. http://hoytech.com/vmtouch/
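The vmtouch-based warm-up Andrey mentions might look something like the sketch below: touch the OSD's on-disk metadata into the page cache just before restarting it, so the startup PG-log reads hit RAM. The paths are the filestore defaults of that era (current/omap and the per-PG *_head collections) and are illustrative only; adjust to your layout:

```shell
#!/bin/sh
# Warm the page cache for one OSD's data directory before restarting it.
# Requires vmtouch (http://hoytech.com/vmtouch/).
warm_osd() {
    id=$1
    dir=${OSD_ROOT:-/var/lib/ceph/osd}/ceph-$id
    if command -v vmtouch >/dev/null 2>&1; then
        # -t touches every page of every file under the given paths,
        # pulling them into the page cache; -q suppresses progress output.
        # Warming only the omap leveldb and the PG collections keeps the
        # footprint bounded instead of caching whole object data.
        vmtouch -q -t "$dir/current/omap" "$dir"/current/*_head 2>/dev/null || true
    else
        echo "vmtouch not installed, skipping warm-up for osd.$id" >&2
    fi
}
```

One would call warm_osd "$id" immediately before each per-OSD restart in the delay loop, combining both approaches as the thread suggests.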