Re: scalability new node to the existing cluster

> 68 OSDs per node sounds an order of magnitude above what you should be doing, unless you have vast experience with Ceph and its memory requirements under stress.
I don't think so. We are also evaluating 90 OSDs per node. To know that it
works, you need to test all the scenarios. Red Hat supports a maximum of
72 OSDs per host, so 68 per node is still within the support limits.

When QoS support arrives, I hope we can put bandwidth limits on recovery;
until then we have to do whatever is acceptable and works for now...
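
Until then, something along the lines below can be done with the existing
knobs (just a sketch; osd.340, the weights and the sleep value are made-up
examples, and osd_recovery_sleep_hdd needs Luminous or newer):

  # bring the new OSDs in with CRUSH weight 0 so no data moves yet
  # (osd.340 is a placeholder id, repeat for each new OSD)
  ceph osd crush reweight osd.340 0

  # throttle backfill/recovery before raising any weights
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  ceph tell osd.* injectargs '--osd-recovery-sleep-hdd 0.1'

  # raise the weights in small steps, waiting for the cluster to settle in
  # between; the CERN ceph-gentle-reweight script linked below automates this
  ceph osd crush reweight osd.340 1.0
  # repeat until the final weight (= drive size in TiB) is reached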

On Wed, Apr 18, 2018 at 5:50 PM, Hans van den Bogert
<hansbogert@xxxxxxxxx> wrote:
> I keep seeing these threads where adding nodes has such an impact on the cluster as a whole that I wonder what the rest of the cluster looks like. Normally I’d just advise someone to put a limit on the concurrent backfills, and `osd max backfills` is already 1 by default. Could it be that the real culprit here is that the hardware is heavily overbooked? 68 OSDs per node sounds an order of magnitude above what you should be doing, unless you have vast experience with Ceph and its memory requirements under stress.
> I wonder if this cluster would even come online after an outage, or would also crumble due to peering and possible backfilling.
>
> To be honest I don’t even get why using the weight option would solve this. The same amount of data needs to be transferred anyway at some point; it seems like a poor man’s throttling mechanism. And if memory shortage is the problem here, due to, again, the many OSDs, then the reweight strategy will only give you slightly better odds.
>
> So
> 1) I would keep track of memory usage on the nodes to see if that increases under peering/backfilling,
>   - If this is the case, and you’re using bluestore: try lowering the bluestore_cache_size* params to give you some leeway.
> 2) If using bluestore, try throttling by changing the following params, depending on your environment:
>   - osd recovery sleep
>   - osd recovery sleep hdd
>   - osd recovery sleep ssd
>
> There are other throttling params you can change, though most defaults are just fine in my environment, and I don’t have experience with them.
>
> Good luck,
>
> Hans
>
>
>> On Apr 18, 2018, at 1:32 PM, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
>>
>> You can add the new OSDs with weight 0 and edit the script below so that
>> it increases the OSD weights instead of decreasing them.
>>
>> https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-gentle-reweight
>>
>>
>> On Wed, Apr 18, 2018 at 2:16 PM, nokia ceph <nokiacephusers@xxxxxxxxx> wrote:
>>> Hi All,
>>>
>>> We have a 5-node cluster with EC 4+1, and each node has 68 HDDs. Now we are
>>> trying to add a new node with 68 disks to the cluster.
>>>
>>> We tried adding the new node and creating all OSDs in one go; the cluster
>>> stopped all client traffic and did only backfilling.
>>>
>>> Is there any procedure to add the new node without affecting client traffic?
>>>
>>> If we create the OSDs one by one, there is no issue with client traffic;
>>> however, the time taken to add a new node with 68 disks would be several months.
>>>
>>> Please provide your suggestions.
>>>
>>> Thanks,
>>> Muthu
>>>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



