Just for my education, why is letting the balancer move the PGs to the new OSDs (the CERN approach) better than throttled backfilling?
Thanks, Massimo
On Sat, Jul 27, 2019 at 12:31 AM Stefan Kooman <stefan@xxxxxx> wrote:
Quoting Peter Sabaini (peter@xxxxxxxxxx):
> What kind of commit/apply latency increases have you seen when adding a
> large numbers of OSDs? I'm nervous how sensitive workloads might react
> here, esp. with spinners.
You mean when there is backfilling going on? Instead of doing "a big
bang" you can also use Dan van der Ster's trick with the upmap balancer:
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
See
https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer
So you would still set norebalance / nobackfill / norecover and keep the Ceph
balancer off. Then you run the script as many times as necessary until the
cluster reports "HEALTH_OK" again (on clusters other than Nautilus) and there
are no more remapped PGs. Unset the flags and enable the Ceph balancer ... now
the balancer will slowly move PGs to the new OSDs.
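For concreteness, a rough sketch of that workflow. The ceph commands are the
standard CLI; piping upmap-remapped.py straight into a shell is an assumption
based on the script printing ceph commands on stdout, so try it on a test
cluster first:

    # pause data movement and the balancer before touching the cluster
    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set norecover
    ceph balancer off

    # ... add the new OSDs / change the CRUSH map here ...

    # the script emits pg-upmap-items commands that map remapped PGs back
    # to the OSDs they currently live on; repeat until no PGs are remapped
    ./upmap-remapped.py | sh

    # hand the actual data movement over to the balancer
    ceph osd unset norebalance
    ceph osd unset nobackfill
    ceph osd unset norecover
    ceph balancer on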
We've used this trick to increase the number of PGs on a pool, and will
use this to expand the cluster in the near future.
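For the pg_num case the only extra step is the pool change itself; the flag /
script / balancer sequence stays the same. Pool name and target count below
are just placeholders:

    # with the flags set and the balancer off, bump pg_num
    ceph osd pool set <pool> pg_num <new_pg_num>
    # pre-Nautilus you also need to raise pgp_num yourself
    ceph osd pool set <pool> pgp_num <new_pg_num>
    # then run upmap-remapped.py as above until no PGs are remapped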
This only works if you can use the balancer in "upmap" mode. Note that
using upmap requires that all clients be Luminous or newer. If you are
using the CephFS kernel client it might report as not compatible (jewel), but
recent Linux distributions work well (Ubuntu 18.04 / CentOS 7).
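To check whether that is the case, something like this should do. The
--yes-i-really-mean-it override is only meant for kernel clients that
mis-report as jewel, so verify your actual kernel versions before using it:

    # show which feature releases the connected clients report
    ceph features

    # require luminous+ clients so pg-upmap-items can be used
    ceph osd set-require-min-compat-client luminous
    # or, if kernel clients wrongly show up as jewel but are new enough:
    # ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it

    # switch the balancer to upmap mode
    ceph balancer mode upmap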
Gr. Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info@xxxxxx