On 26.07.19 15:03, Stefan Kooman wrote:
> Quoting Peter Sabaini (peter@xxxxxxxxxx):
>> What kind of commit/apply latency increases have you seen when adding a
>> large number of OSDs? I'm nervous about how sensitive workloads might
>> react here, esp. with spinners.
>
> You mean when there is backfilling going on? Instead of doing "a big

Yes, exactly. I usually tune down the max rebalance and max recovery
active knobs to lessen the impact (the specific options are sketched at
the bottom of this mail), but even so I have found that the additional
write load can substantially increase I/O latencies. Not all workloads
like this.

> bang" you can also use Dan van der Ster's trick with the upmap balancer:
> https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
>
> See
> https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer

Thanks, that's interesting -- though I wish it weren't necessary. (A
rough sketch of the sequence follows at the end of this mail.)

cheers,
peter.

> So you would still have norebalance / nobackfill / norecover set and the
> ceph balancer off. Then you run the script as many times as necessary to
> get "HEALTH_OK" again (on clusters other than nautilus) and there are no
> more PGs remapped. Unset the flags and enable the ceph balancer ... now
> the balancer will slowly move PGs to the new OSDs.
>
> We've used this trick to increase the number of PGs on a pool, and will
> use it to expand the cluster in the near future.
>
> This only works if you can use the balancer in "upmap" mode. Note that
> using upmap requires that all clients be Luminous or newer. If you are
> using the cephfs kernel client it might report as not compatible (jewel),
> but recent Linux distributions work well (Ubuntu 18.04 / CentOS 7).
>
> Gr. Stefan
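
A minimal sketch of the throttling I mean above; the values are only
illustrative, and the "ceph config" form assumes a cluster recent enough
to have the config database (older releases take the same options via
injectargs):

  # limit concurrent backfill/recovery per OSD (illustrative values)
  ceph config set osd osd_max_backfills 1
  ceph config set osd osd_recovery_max_active 1
  # extra pause between recovery ops on spinners
  ceph config set osd osd_recovery_sleep_hdd 0.1

  # on releases without the config database, inject into running OSDs instead
  ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'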
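
And a rough sketch of the upmap-remapped sequence Stefan describes, as I
read the script and slides; the "./upmap-remapped.py | sh" invocation is
my assumption (the script prints "ceph osd pg-upmap-items" commands), so
check the repo before running anything:

  # upmap needs Luminous or newer clients
  ceph features
  ceph osd set-require-min-compat-client luminous

  # pause data movement and the balancer before adding the new OSDs
  ceph osd set norebalance
  ceph osd set nobackfill
  ceph osd set norecover
  ceph balancer off

  # ... add the new OSDs here ...

  # run the script (repeatedly if needed) until no PGs are left remapped;
  # it prints pg-upmap-items commands which the shell then executes
  ./upmap-remapped.py | sh

  # unset the flags and let the balancer slowly move PGs to the new OSDs
  ceph osd unset norebalance
  ceph osd unset nobackfill
  ceph osd unset norecover
  ceph balancer mode upmap
  ceph balancer on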